SYSTEMS AND METHODS FOR CREATING, MODIFYING, 
INTERACTING WITH AND PLAYING MUSICAL COMPOSITIONS 



Field of the Invention 

The present invention relates to systems and methods for creating, modifying, interacting 
with and playing music, and more particularly to systems and methods employing a top-down 
and interactive auto-composition process, where the systems/methods provide the user with a 
musical composition that may be modified and interacted with and played and/or stored (for later 
play) in order to create music that is desired by the particular user. 
Background of the Invention 

A large number of distinct musical styles have emerged over the years, as have systems 
and technologies for creating, storing, and playing back music in accordance with such styles. 
Music creation, particularly of any quality, typically has been limited to persons who have 
musical training or who have expended the time and energy required to learn and play one or 
more instruments. Systems for creating and storing quality musical compositions have tended 
towards technologies that utilize significant computer processing and/or data storage. More 
recent examples of such technologies include compact disc (CD) audio players and players of 
compressed files (for instance as per the MPEG-level 3 standard), etc. Finally, there exist 
devices incorporating a tuner, which permit reception of radio broadcasts via electromagnetic 
waves, such as FM or AM radio receivers. 

Electronics and computer-related technologies have been increasingly applied to musical 
instruments over the years. Musical synthesizers and other instruments of increasing complexity 
and musical sophistication and quality have been developed, a "language" for conversation 
between such instruments has been created, which is known as the MIDI (Musical Instrument 
Digital Interface) standard. While MIDI-compatible instruments and computer technologies have 
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had a great impact on the ability to create and playback or store music, such systems still tend to 
require substantial musical training or experience, and tend to be complex and expensive. 

Accordingly, it is an object of the present invention to provide systems and methods for 
creating, modifying, interacting with and/or playing music employing a top-down process, where 
the systems/methods provide the user with a musical composition that may be modified and 
interacted with and played and/or stored (for later play) in order to create music that is desired by 
the particular user. 

It is another object of the present invention to provide systems and methods for creating, 
modifying, interacting with and/or playing music that enables a user to quickly begin creating 
desirable music in accordance with one or a variety of musical styles, with the user modifying an 
auto-composed or previously created musical composition, either for a real time performance 
and/or for storing and subsequent playback. 

It is another object of the present invention to provide systems and methods for creating, 
modifying, interacting with and/or playing music in which a graphical interface is provided to 
facilitate use of the system and increase user enjoyment of the system by having graphic 
information presented in a manner that corresponds with the music being heard or aspects of the 
music that are being modified or the like; it also is an object of the present invention to make 
such graphic information customizable by a user. 

It is another object of the present invention to provide systems and methods for creating, 
modifying, interacting with and/or playing music in which a graphical interface is provided that 
presents a representation of a plurality of musical lanes, below each of which is represented a 
tunnel, in which a user may modify musical parameters, samples or other attributes of the 
musical composition, with such modifications preferably being accompanied by a change in a 
visual effect. 

It is another object of the present invention to provide systems and methods for creating, 
modifying, interacting with and/or playing music in which music may be represented in a form to 
be readily modified or used in an auto-composition algorithm or the like, and which presents 
reduced processing and/or storage requirements as compared to certain conventional audio 
storage techniques. 
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It is another object of the present invention to provide systems and methods for creating, 
modifying, interacting with and/or playing music in which music may be automatically 
composed in a variety of distinct musical styles, where a user may interact with auto-composed 
music to create new music of the particular musical style, where the system controls which 
5 parameters may be modified by the user, and the range in which such parameters may be changed 
by the user, consistent with the particular musical style. 

It is another object of the present invention to provide systems and methods for creating, 
modifying, interacting with and/or playing music based on efficient song structures and ways to 
represent songs, which may incorporate or utilize pseudo-random/random events in the creation 
40 of musical compositions based on such song structures and ways to represent songs, 
y It is another object of the present invention to provide systems and methods for creating, 

III modifying, interacting with and/or playing music in which songs may be efficiently created, 
3 stored and/processed; preferably songs are represented in a form such that a relatively small 
pj amount of data storage is required to store the song, and thus songs may be stored using 
si 5 relatively little data storage capacity or a large number of songs may be stored in a given data 
p storage capacity, and songs may be transmitted such as via the Internet using relatively little data 
? ==f transmission bandwidth. 

S lt is another object of the present invention to provide systems and methods for creating, 

modifying, interacting with and/or playing music in which a modified MIDI representation of 

20 music is employed, preferably, for example, in which musical rule information is embedded in 
MIDI pitch data, musical rules are applied in a manner that utilize relative rhythmic density and 
relative mobility of note pitch, and in which sound samples may be synchronized with MIDI 
events in a desirable and more optimum manner. 

It is another object of the present invention to provide systems and methods for creating, 

25 modifying, interacting with and/or playing music in which a hardware/software system preferably 
includes a radio tuner so that output from the radio tuner may be mixed, for example, with auto- 
composed songs created by the system, which preferably includes a virtual radio mode of 
operation; it also is an object of the present invention to provide hardware that utilizes non- 
volatile storage media to store songs, song lists and configuration information, and hardware that 
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facilitates the storing and sharing of songs and song lists and the updating of sound banks and the 
like that are used to create musical compositions. 

It is another object of the present invention to provide systems and methods for creating, 
modifying, interacting with and/or playing music that works in conjunction with a companion PC 
software program that enables users to utilize the resources of a companion PC and/or to easily 
update and/or share Play lists, components of songs, songs, samples, etc. 

It is another object of the present invention to provide systems and methods for creating, 
modifying, interacting with and/or playing music in which songs may be generated, exchanged 
and disseminated, preferably or potentially on a royalty free basis. 

Finally, it is an object of the present invention to provide systems and methods for 
S creatin & modifying, interacting with and/or playing music that may be adapted to a variety of 
hj applications, systems and processes in which such music creation may be utilized. 
Summary of the Invention 

" 

1 U The present invention addresses such problems and limitations and provides systems and 

Js methods that may achieve such objects by providing hardware, software, musical composition 
algorithms and a user interface and the like (as hereinafter described in detail) in which users 
may readily create, modify, interact with and play music. In a preferred embodiment, the system 
is provided in a handheld form factor, much like a video or electronic game. A graphical display 
is provided to display status information, graphical representations of musical lanes or 
components, which preferably vary in shape, color or other visual attribute as musical parameters 
and the like are changed for particular instruments or musical components such as a microphone 
input, samples, etc. The system preferably operates in a variety of modes such that users may 
create, modify, interact with and play music of a desired style, including an electronic DJ ("e- 
DJ") mode, a virtual radio mode, a song/song list playback mode, sample create/playback mode 
25 and a system mode, all of which will be described in greater detail hereinafter. 

Preferred embodiments employ a top-down process, where the system provides the user 
with in effect a complete musical composition, basically a song, that may be modified and 
interacted with and played and/or stored (for later play) in order to create music that is desired by 
the particular user. Utilizing an auto-composition process employing musical rules and 
30 preferably a pseudo random number generator, which may also incorporate randomness 
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introduced by timing of user input or the like, the user may then quickly begin creating desirable 
music in accordance with one or a variety of musical styles, with the user modifying the auto- 
composed (or previously created) musical composition, either for a real time performance and/or 
for storing and subsequent playback. 

A graphical interface preferably is provided to facilitate use of the system and increase 
user enjoyment of the system by having graphic information presented in a manner that 
corresponds with the music being heard or aspects of the music that are being modified or the 
like. An LCD display preferably is used to provide the graphical user interface, although an 
external video monitor or other display may be used as an addition or an alternative. In preferred 
embodiments, such graphic information is customizable by a user, such as by way of a 
companion software program, which preferably runs on a PC and is coupled to the system via an 
interface such as a USB port. For example, the companion software program may provide 
templates or sample graphics that the user may select and/or modify to customize the graphics 
displayed on the display, which may be selected and/or modified to suit the particular user's 
preferences or may be selected to correspond in some manner to the style of music being played. 
In one embodiment, the companion software program provides one or more templates or sample 
graphics sets, wherein the particular template(s) or sample graphic set(s) correspond to a 
particular style of music. With such embodiments, the graphics may be customized to more 
closely correspond to the particular style of music being created or played and/or to the personal 
preferences of the user. 

The graphical interface preferably presents, in at least one mode of operation, a visual 
representation of a plurality of musical lanes or paths corresponding to components (such as 
particular instruments, samples or microphone input, etc.). In addition to allowing the user to 
visualize the various components of the musical composition, through user input (such as 
through a joystick movement) the user may go into a particular lane, which preferably is 
represented visually by a representation of a tunnel. When inside of a particular tunnel, a user 
may modify musical parameters, samples or other attributes of the musical composition, with 
such modifications preferably being accompanied by a change in a visual effect that accompany 
the tunnel. 
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In accordance with preferred embodiments, music may be automatically composed in a 
variety of distinct musical styles. The user preferably is presented with a variety of pre-set 
musical styles, which the user may select. As a particular example, in e-DJ mode, the user may 
select a particular style from a collection of styles (as will be explained hereinafter, styles maybe 
5 arranged as "style mixes" and within a particular style mix one or more particular styles, and 
optionally substyles or "microstyles." After selection of a particular style or substyle, with a 
preferably single button push (e.g., play) the system begins automatically composing music in 
accordance with the particular selected style or substyle. Thereafter, the user may interact with 
the auto-composed music of the selected style/substyle to modify parameters of the particular 
^10 music (such as via entering a tunnel for a particular component of the music), and via such 
Q modifications create new music of the particular musical style/substyle. In order to facilitate the 
l=j creation of music of a desirable quality consistent with the selected style/substyle, the system 
preferably controls which parameters may be modified by the user, and the range over which 
pi such parameters may be changed by the user, consistent with the particular musical 
1 5 style/ substyle. The system preferably accomplishes this via music that may be represented in a 
Z form t0 be readily modified or used in an auto-composition algorithm or the like. The musical 

data representation, and accompanying rules for processing the musical data, enable music to be 
p auto-composed and interacted with in a manner that presents reduced processing and/or storage 
requirements as compared to certain conventional audio storage techniques (such as CD audio, 
20 MP3 files, WAV files, etc.). 

In accordance with certain embodiments, the system operates based on efficient song 
structures and ways to represent songs, which may incorporate or utilize pseudo-random/random 
events in the creation of musical compositions based on such song structures and ways to 
represent songs. Songs may be efficiently created, stored and/processed, and preferably songs are 
25 represented in a form such that a relatively small amount of data storage is required to store the 
song. Songs may be stored using relatively little data storage capacity or a large number of songs 
may be stored in a given data storage capacity, and songs may be transmitted such as via the 
Internet using relatively little data transmission bandwidth. In preferred embodiments, a 
modified MIDI representation of music is employed, preferably, for example, in which musical 
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rule information is embedded in MIDI pitch data, and in which sound samples may be 
synchronized with MIDI events in a desirable and more optimum manner. 

The system architecture of preferred embodiments includes a microprocessor or 
microcontroller for controlling the overall system operation. A synthesizer/DSP is provided in 
5 certain embodiments in order to generate audio streams (music and audio samples, etc.). Non- 
volatile memory preferably is provided for storing sound banks. Preferably removable non- 
volatile storage/memory preferably is provided to store configuration files, song lists and 
samples, and in certain embodiments sound bank optimization or sound bank data. A codec 
preferably is provided for receiving microphone input and for providing audio output. A radio 
£0 tuner preferably is provided so that output from the radio tuner may be mixed, for example, with 
jjj auto-composed songs created by the system, which preferably includes a virtual radio mode of 
|y operation. The system also preferably includes hardware and associated software that facilitates 
| the storin g md sharing of songs and song lists and the updating of sound banks and the like that 
|*( are used to create musical compositions. 

j5 In alternative embodiments, the hardware, software, musical data structures and/or user 

|I interface attributes are adapted to, and employed in, a variety of applications, systems and 

processes in which such music creation may be utilized. 
O Such aspects of the present invention will be understood based on the detailed description 

to follow hereinafter. 
20 Brief Description of the Drawings 

The above objects and other advantages of the present invention will become more 

apparent by describing in detail the preferred embodiments of the present invention with 

reference to the attached drawings in which: 

Fig. 1 illustrates an exemplary preferred embodiment of a "Player" in accordance with the 
25 present invention; 

Figs. 2-3 illustrate exemplary preferred function and mode keys in accordance with the 
present invention; 

Figs. 4-13 illustrate exemplary preferred screens of the graphical user interface in 
accordance with the present invention; 



7 



Fig. 14 is a table illustrating exemplary configuration parameters used in accordance with 
certain preferred embodiments of the present invention; 

Fig. 15 illustrates the song structure used in certain preferred embodiments of the present 
invention; 

5 Fig. 16 illustrates an exemplary preferred musical generation flow utilized in certain 

preferred embodiments of the present invention; 

Fig. 17 is a table illustrating exemplary virtual notes/controllers utilized in certain 
preferred embodiments of the present invention; 

Fig. 18 is a diagram illustrating Tessitura principles utilized in accordance with certain 
1 0 embodiments of the present invention; 

Fig. 19 illustrates principles of encoding musical key changes preferably as offsets, which 
is utilized in accordance with preferred embodiments of the present invention; 

Fig. 20 illustrates a mode application musical rule that preferably is part of the overall 
process in accordance with preferred embodiments of the present invention; 

Fig. 21 illustrates an exemplary preferred virtual pattern to real pattern flow utilized in 
preferred embodiments of the present invention; 

Fig. 22 illustrates principles of relative rhythmic density utilized in accordance with 
certain embodiments of the present invention; 

Fig. 23 illustrates principles of the relative mobility of note pitch utilized in accordance 
with certain embodiments of the present invention; 

Fig. 24 illustrates a pattern structure creation example in accordance with certain 
embodiments of the present invention; 

Fig. 25 illustrates a block structure creation example in accordance with certain 
embodiments of the present invention; 

Figs. 26-27 illustrate Pseudo-Random Number generation examples utilized in certain 
preferred embodiments of the present invention; 

Fig. 28 illustrates attributes of simple data structures utilized in accordance with certain 
preferred embodiments of the present invention; 

Fig. 29 illustrates an exemplary simple data structure flow in accordance with certain 
preferred embodiments of the present invention; 
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Fig. 30 illustrates attributes of complex data structures utilized in accordance with certain 
preferred embodiments of the present invention; 

Fig. 3 1 illustrates an exemplary complex data structure flow in accordance with certain 
preferred embodiments of the present invention; 

Figs. 32-34 illustrate exemplary hardware configurations of certain preferred 
embodiments of the player and a docking station in accordance with the present invention; 

Fig. 35 illustrates an exemplary address map for the microprocessor utilized in 
accordance with certain preferred embodiments of the present invention; 

Fig. 36 illustrates an exemplary address map for the synthesizer/DSP utilized in 
accordance with certain preferred embodiments of the present invention; 

Figs. 37-38 illustrate the use of a DSP bootstrap/addressing technique utilized in 
accordance with certain preferred embodiments of the present invention; 

Fig. 39 illustrates a simplified logical arrangement of MIDI and audio streams in the 
music generation process for purposes of understanding preferred embodiments of the present 
invention; 

Fig. 40 illustrates a simplified MIDI and audio stream timeline for purposes of 
understanding preferred embodiments of the present invention; and 

Figs. 41-42 illustrate the use of Non-Registered Parameter Number for purposes of 
synchronizing MIDA events and audio samples in accordance with certain preferred 
embodiments of the present invention. 
Detailed Description of Exemplary Preferred Embodiments 

The present invention will be described in greater detail with reference to certain 
preferred and certain other embodiments, which may serve to further the understanding of 
preferred embodiments of the present invention. As described elsewhere herein, various 
refinements and substitutions of the various elements of the various embodiments are possible 
based on the principles and teachings herein. 

In accordance with the present invention, music may be created (including by auto- 
composition), interacted with, played and implemented in a variety of novel ways as will be 
hereinafter described via numerous exemplary preferred and alternative embodiments. Included 
in such embodiments are what may be considered as top-down approaches to musical creation. 



Top-down as used herein generally means that a complete song structure for quality music is 
created for the end user as a starting point. This enables the user to immediately be in position to 
create quality music, with the user then having the ability to alter, and thereby create new music, 
based on the starting point provided by the system. Where a particular user takes the music 
creation process is up to them. More conventional musical creation processes involve a bottom- 
up approach, wherein the rudiments of each instrument and musical Style are learned, and then 
individual notes are put together, etc. This conventional approach generally has the side-effect of 
limiting the musical creation to a small group of trained people, and has, in effect, barred the 
wider population from experiencing the creative process with music. 

A useful analogy for purposes of understanding embodiments of the present invention is 
that of building a house. In the conventional means of house-building, the user is given a bunch 
of bricks, nails, wood, and paint. If you want a house, you need to either learn all the intricacies 
of how to work with each of these materials, as well as electrical wiring, plumbing, engineering, 
etc., or you need to find people who are trained in these areas. Similarly, in musical creation, if 
you want a song (that is pleasing), you need to learn all about various types of musical 
instruments (and each of their unique specialties or constraints), as well as a decent amount of 
music theory, and acquire a familiarity with specific techniques and characteristics in a given 
Style of music (such as techno, jazz, hip-hop, etc.). 

It would, of course, be far more convenient if, when someone wanted a house, they were 
given a complete house that they could then easily modify (with the press of a button). For 
example, they could walk into the kitchen and instantly change it to be larger, or a different color, 
or with additional windows. And they could walk into the bathroom and raise the ceiling, put in 
a hot tub, etc. They could walk into the living room and try different paint schemes, or different 
furniture Styles, etc. Similarly, in accordance with embodiments of the present invention, the 
user desirably is provided with a complete song to begin with, they can then easily modify, at 
various levels from general to specific, to create a song that is unique and in accordance with the 
user's desires, tastes and preferences. 

In accordance with the present invention, the general population of people readily may be 
provided with an easy approach to musical creation. It allows them the immediate gratification 
of a complete song, while still allowing them to compose original music. This top down 
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approach to musical creation opens the world of musical creativity to a larger group of people by 
reducing the barriers to creating pleasurable music. 

In accordance with the present invention, various systems and methods are provided that 
enable users to create music. Such systems and methods desirably utilize intuitive and easy to 
5 learn and use user interfaces that facilitate the creation of, and interaction with, music that is 
being created, or was created previously. Various aspects of one example of a preferred 
embodiment for a user interface in accordance with certain preferred embodiments of the present 
invention will now be described. 

In accordance with such preferred embodiments of the present invention, user interface 
J.0 features are provided that desirably facilitate the interactive generation of music. The discussion 
J*f of such preferred embodiments to be herein after provided are primarily focused on one example 
|y of a handheld, entry-level type of device, herein called 'Player ' . However, many of the novel and 
7: inventive features discussed in connection with such a Player relate to the visual enhancement of 
pj the control and architecture of the music generation process; accordingly they can apply to other 
15 types of devices, such as computing devices, web server/websites, kiosks, video, or other 
| s [ electronic games and other entertainment devices that allow music creation and interaction, and 
J thus also may benefit from such aspects of the present invention. A discussion of certain of the 
Cl other types of devices is provided hereinafter. As will be appreciated by one of ordinary skill in 

the art, various features of the user interface of the Player can be understood to apply to such a 
20 broader range of devices. 

Generally, the goal of the user interface is to allow intuitive, simple operation of the 
system and interaction with various parameters with a minimum number of buttons, while at the 
same time preserving the power of the system. Fig. 1 illustrates an exemplary system 
configuration for Player 10. Display 20 provides visual information to the user, as will 
25 hereinafter be described. Various mode keys 16 provide buttons that enable a user to directly 
access, or initiation, modes of operation of the system as will be hereinafter described. Joystick 
15 is provided to enable the user to select or interact with various musical or system parameters 
or the like, as will be hereinafter described. Save/edit key 17 preferably is provided to save songs 
or parameter changes, etc., that a user may have created or made using the system, and also to 
30 initiate editing of parameters, Play lists, samples, etc., such as will be described hereinafter. 
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Volume key(s) 14 is/are provided, either in dual button up/down form or a single knob or dial to 
enable the output volume level to be adjusted. Function keys 1 1 preferably are provided to 
enable player functions such as play (ok), stop (cancel), forward (insert/create), reverse (delete) 
and record, exemplary uses of which will be described in greater detail hereinafter. FX key 12 
5 preferably is provided to enable a user to easily and intuitively adjust one or more audio effects 
(e.g., doppler, reverb, wobbler, custom, etc.) of a part of the music (e.g., a particular sample 
sound); one preferred way to enable an intuitive sound effect selection by the user is to enable to 
FX key 12 to be used in combination with the Joystick 15 left and right controls, a corresponding 
preferred way to enable intuitive sound effect adjustment (e.g., increase or decrease the effect of 
the selected sound effect) is to enable to the FX Key 12 to be used in combination with the 
Joystick 15 up and down controls. Pitch/tempo key 13 preferably is provided to enable single 

|d button activation for pitch/tempo changes (preferably along with joystick movements), as will be 
hereinafter described in greater detail. On/off button 18 preferably is provided to turn on or off 

jjj j the player, and preferably a brief depression/toggle can be used to turn on/off an LCD backlight, 

si 5 although, for example, other turn off modes may be used as well (such as a time out turn off, 

12 when the player is not playing and there has been no activity detected for a predetermined time 
out period, etc. Exemplary desirable uses of such buttons and keys provided in the illustrative 

O Player 10 embodiment will become more apparent based on the discussion to follow. 

In accordance with preferred embodiments, a Home mode is provided. Home mode is a 

20 default mode that can be automatically entered when Player 10 is turned on. As the example of 
Fig. 4 shows, Home mode preferably displays an animated screen prompting the user to select a 
mode by pressing a direct access mode key 16 or entering help mode by pressing the joystick 
(Fig. 4 depicts the moment of the animation that prompts for the Radio direct access key). In 
preferred embodiments, a user can define the graphics displayed on the display 20 using, for 

25 example, a companion PC software program (discussed in greater detail below) to select graphics 
(animated or otherwise) to be automatically substituted (if available) for the default graphics 
during the different modes of operation. In this example of custom screens, data files 
corresponding to the customized screen graphics for each section of a song, and/or each mode of 
operation, preferably can be stored as part of the song data structure (discussed below) in a 

30 storage location of a removable memory means such as the Flash memory in a Smart Media Card 
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(SMC). In preferred embodiments, in Home mode the screen scrolls through various modes that 
are available in the system, such as modes associated with mode/direct access keys 16 (see, 
again, Fig. 1). Additionally, Player 10 preferably is configured to return to Home mode from the 
main menu of any other mode (i.e., from the user pressing the Stop key). When the joystick is 
pressed in Home mode, preferably a help screen is displayed prompting the user to press any key 
for help. An example help screen is shown in Fig. 5. In accordance with this example, when a 
key is pressed while Player 10 is displaying this screen, helpful text relating to that key is 
displayed. 

Play can be used when in Home mode to enter a particularly important visual interface 
mode referred to herein as the I- Way mode (discussed in greater detail below). As shown in the 
example of Fig. 6, the preferably LCD screen can display a message regarding other possible 
modes, such as "e.DJ Style", in the status line and propose a selection of music Styles/SubStyles 
(e.g.; Techno Mix, House, Garage, etc.). At this type of screen, to select a desired Style, a user 
can press Up or Down. In this example, Styles in uppercase preferably denote a category of 
SubStyles that are randomly chosen for each song, and SubStyles preferably are indicated by 
lowercase Styles proceeding each uppercase Style. Once the user selects a Style, to enter I- Way 
mode with the selected Style, the user can press Play. Once the I- Way mode is entered, 
preferably Player 10 automatically creates, and starts playing, a song in the chosen Style. 
Exemplary Styles/SubStyles that preferably are provided in accordance with certain preferred 
embodiments include: Coolmix (SubStyles ballad, bossa, new age); Hip Hop Mix (SubStyles hip 
hop, rap, R&B, downbeat, ragga); Kitsch; Techno Mix (SubStyles house, garage, trance, jungle); 
etc. What is important to note is that, in accordance with preferred embodiments, distinct music 
Styles are determined, at least some of the musical Styles including distinct SubStyles, wherein 
characteristics of the particular Style and/or SubStyle result in different musical rules being 
applied to the automatic creation of music in accordance with the particular Style/SubStyle (the 
use of musical rules and other algorithmic and other details of the preferred music generation 
process is discussed in greater detail elsewhere herein), with an intuitive and easy to use interface 
provided to enable the ready creation and user modification of music in accordance with the 
particular Style/SubStyle, etc. In additional embodiments the use of an even finer gradation of 
musical aesthetic is available to the user in the form of a MicroStyle. For example, a plurality of 
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MicroStyles are provided that all generally conform to a particular SubStyle, while the SubStyle 
is accompanied by one or more other SubStyles that together generally conform to a particular 
Style. This third tier of musical granularity preferably gives the discerning user even finer 
control over the musical output of the algorithmic music. Such MicroStyles preferably provide 
more consistent music, while perhaps losing some of the flexibility of Styles/SubStyles. What is 
important is that the user is provided with a plurality of levels of musical style categorizations, 
where basically at each descending level the range of musical parameters that may be varied by 
the user and/or the auto-composition algorithm and the like are progressively more constrained, 
consistent with the particular Style, SubStyle or MicroStyle that is selected, etc. 

An important feature of Home mode is the ability to configure Player 10 to start playing 
music quickly and easily. This is because, although Player 10 is configured to be interactive, and 
many professional-grade features are available to adjust various aspects of the Style and sound, it 
is desirable to have a quick and easy way for users to use the Player in a 'press-it-and-forget-it' 
mode. Thus, with only very few button pushes, a user with little or no musical experience, or 
little or no experience with Player 10, may easily begin composing original music with Player 10 
of a desired Style or SubStyle. An additional preferred way to provide an auto-play type of 
capability is to use a removable storage memory medium (e.g., Smart Media Card) to store a Play 
list, such as a file containing a list of song data structures that are present on the removable 
memory. Following this example, when the user inserts the removable memory, or when the 
system is powered on with a removable memory already inserted, preferably the system will scan 
the removable memory to look for such a file containing a Play list and begin to play the song 
data structures that are listed in the system file. Preferably, this arrangement can be configured 
such that the Auto-Play mode is selectable (such as via a configuration setting in the system file), 
and that the system will wait a short duration before beginning Auto-Play, to allow the user an 
opportunity to enter a different mode on the system if so desired. 

As illustrated in Fig. 7, an exemplary, preferred screen for an I-Way mode depicts the 
front view of the user driving or moving down a visual representation of a highway or multi-lane 
road or path. Along the very top of the screen preferably is a status message that displays the 
current section or status of the ongoing eDJ session (for example: part 1, filtering drums, chorus, 
Part 2, «sample name», etc.). Preferably, other ways of displaying messages to the user to 
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more prominently indicate a status message can be used; for example, the system can 
momentarily flash a large visual indicator that takes up almost the entire screen. Preferably, 
directly in front of the field of view is a visual representation of a speaker that preferably is 
pulsing in time with the music being played. Preferably, each lane of the I- Way represents 
5 various types of elements of a song; such as instrument lanes (drums, bass, riff, lead), one or 
more sample lanes (to interact with pre-stored samples of voices, sounds, etc), and one or more 
microphone lanes which manage the microphone input in real-time. Other categories for lanes 
can be envisioned that are within the spirit and scope of the present invention. What is important 
to this aspect of the present invention that the user be presented with a multi-lane visual 
,10 representation that includes a plurality of lanes, each of which corresponds to a constituent 
Q component or effect, etc., of the music that is being composed or played. The user preferably 

WSJ? m t 

lil uses joystick 15 (for example, a circular button that can depress in 4 areas: top, bottom, left and 

J* right, such as illustrated in Fig. 1) to move the center of view around. Generally, each directional 
depression of joystick 15 causes the center of view to shift in the corresponding direction. For 

i 5 example, when in the left lane and the right joystick button is pressed, the center of view moves 
over one lane to the right. In alternative embodiments, additional layers of interactivity can be 

»j{ presented with additional horizontal layers of the I- Way. For example, when at the lane of the I- 

O Way for the drums (an instrument with distinct instrument components, such as snare, bass, floor 
torn, high hat, crash cymbal, ping-ride cymbal, roto-toms, etc.; orchestral percussion, such as 

20 tympani, gong, triangle, etc.), the user could press the down key to go down to another I- Way for 
the drums or other multiple component instrument, with a lane for each drum or component, 
and/or for different aspects of the drum or instrument sound. This concept of multiple I- Way 
interfaces can be selectively used for only the instruments that benefit from such an approach, 
such as the drums or other multiple component instrument (while other instruments maintain a 

25 single I-Way interface, etc.). The use of additional I- Way lanes is not necessary to enjoy all the 
benefits of the present invention, but is a desirable feature for certain uses of the invention, such 
as products geared for more professional uses, or for music Styles where additional user interface 
and instrument control complexity is desirable, such as classical music, or jazz. 

While in I- Way mode, the screen preferably is animated with sound waves or pulses 

30 synchronized with music beats. In the example of Fig. 7, a visual representation of a round 
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speaker is graphically represented in the center to symbolize the relative volume of the current 
lane. This graphic item preferably is configured to disappear, or be otherwise altered, when the 
lane is muted. It also can be configured to become bigger and smaller as the relative volume of 
that particular lane/section is adjusted (for example, by using a function key in combination with 
the joystick up and down buttons). Other simple variations are within the scope of the present 
invention, such as volume indicators visible in each lane at the same time, mute indications for 
each lane visible at the same time, graphic items in each lane visually reminiscent of the 
instrument represented by that lane, etc. 

In an auto composition mode such as the I- Way mode it is Player 10 itself preferably that 
decides about a song progression in that it can automatically add/remove instruments, do music 
breaks, drums progressions, chord progressions, filtering, modulation, play samples in sync with 
the music, select samples to play based on rules, etc., to end up sounding like in a real song on a 
CD or from the radio. After a few minutes, if nothing is done by the user, Player 10 preferably is 
configured to end the song, preferably with an automatic fade out of volume, and automatically 
compose and play a new song in the same Style, or alternatively a different Style. It also should 
be understood that I-Way mode also is applicable in preferred embodiments for music that is not 
auto-composed, such as a song that the user created/modified using Player 10 (which may have 
been created in part using auto-composition) and stored in Player 10 for subsequent playback, 
etc. 

In certain embodiments, newly composed patterns are numbered from 1 to n. This 
number can be displayed in the status line to help the user remember a music pattern he/she likes 
and come back to it after having tried a few other ones. In certain embodiments, this number 
might only be valid inside a given song and for the current interactive session. In other words, 
for example, the Riff pattern number 5 for the current song being composed would not sound like 
the Riff pattern number 5 composed in another song. However, if this song is saved as a user 
song, although the Riff music will be the same when replayed later, the number associated to it 
could be different. 

In one exemplary embodiment, Player 10 "remembers" up to 16 patterns previously 
composed during the current interactive session. This means, for example, that if the current 
pattern number displayed is 25, the user can listen to patterns from number 10 to 25 by browsing 
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forward through the previously composed patterns (patterns 1-9, in this embodiment, having 
been overwritten or otherwise discarded). If the User wants to skip a given composed pattern 
that is currently being played, he/she can, and the pattern number will not be incremented, 
meaning that currently played pattern will be lost. This feature can be used to store only specific 
5 patterns in the stack of previously played patterns, as desired by the user. What is important is 
that the user can create musical patterns, and selectively store (up to some predetermined number 
of musical patterns), with the stored patterns used to compose music that is determined by the 
user based on the user's particular tastes or desires, etc. The views presented by I- Way mode 
desirably facilitate this user creation and interaction with, and modification of, the music that is 
glj) be created/played by Player 1 0. 

y hi certain preferred embodiments, if desired by a user, additional music parameters of an 

S y instrument associated with a particular lane in the I- Way mode may be "viewed" and interacted 
J with by the user. For example, if a Down is pressed (such as by way of joystick 1 5) while in I- 
hJ Way mode, the center of view is taken "underground," to the "inside" of a particular lane. This 
15 transition to Underground mode preferably is made visually appealing by configuring a screen 
12 animation depicting the movement of the point of view down through the floor or bottom of the 
I- Way lane, into what appears to be a visual representation of a tunnel below a particular lane 

O that corresponds to the musical component represented by that lane. When inside the tunnel 

ill 

beneath a particular lane, a pulse indication (similar to the speaker pulse) preferably occurs in 
20 time with the tempo of the I- Way session. Furthermore, the left and right walls of the tunnel can 
be used to indicate the wave shape of the left and right sound channel outputs. 

The far end of the tunnel preferably is comprised of a shape (for example, a rectangle or 
other geometric) that can change in correlation to the value of one or more of the parameters 
affecting the sound of that particular lane. By way of example, in the case of drums, a filter 
25 parameter can be changed by depressing the function or Fx button (see, again Fig. 1), plus the 
joystick up or down button; at this time the shape comprising the end of the tunnel either changes 
shape or visually appears to get farther away or nearer. In another example, the pitch of a guitar 
can be adjusted by pressing the pitch key along with the left or right joystick button; at the same 
time, the shape can become more or less slanted as the pitch parameter is incremented or 
30 decremented in value, or alternatively a visual representation of the tunnel going up hill or down 
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hill can be provided to visually represent an increase or decrease in pitch. In other examples, to 
change a right/left or stereo balance type of effect, the function or Fx button could be depressed 
to put the system in a mode to change the parameter along with left/right or up/down joystick 
button; such inputs could, for example, result in the sound balance going more towards the right 
channel than the left channel (and be accompanied by a visual representation of the tunnel 
turning to the right, or vice versa for the balance shifting towards the left channel), or the tunnel 
opening becoming larger in width or smaller in width if a wider or narrower stereo effect is 
desired. These are but several examples of how the shape or other visual effect can be modulated 
in correlation to the user input to one or more parameters effecting the sound. What is important 
is that, when the user "tunnels" into a particular instrument lane, various parameters associated 
with the instrument are changeable by the user, with at least certain of the changes in parameter 
being accompanied by a change in the visual representations provided to the user, such as the 
shape, size, color (for color display embodiments) or motions of the displayed visual 
representations. 

While in Underground mode, Player 10 preferably is configured to continue looping with 
the same musical sequence while the user is able to interact with and modify the specific element 
(e.g., the drums) using the joystick and other buttons of Player 10. Also, while down in a lane 
corresponding to a particular component, preferably the left and right buttons of the joystick can 
be used to move from one component parameter to another. Alternatively, side to side joystick 
movements, for example, may enable the user to step through a series of preset characteristics or 
parameters (i.e., with simple joystick type user input, the user may change various parameters of 
the particular component, hear the music effect(s) associated with such parameter changes, and 
determine desirable characteristics for the particular music desired by the user at the particular 
point in time, etc.). In yet another alternative, side to side joystick movements, for example, may 
cause the view to shift from one tunnel to an adjacent tunnel, etc. All such alternatives are within 
the scope of the present invention. 

In addition to other similar variations, the user can mute a particular lane in the I- Way 
mode preferably by use of Stop key (shown in Fig. 2). In this example, while the lane is muted, 
"Muted" can be displayed in the status bar and the round speaker can disappear. Preferably in 
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accordance with such embodiments, the user can un-mute the instrument by again pressing the 
Stop key. 

An additional desirable variation of the user interface preferably involves animating a 
change to the visual appearance, corresponding to a new song part. For example, if in the 
5 Underground mode shown in Fig. 8, or in the I- Way mode shown in Fig. 7, the movement to a 
chorus section is accompanied by a movement through an opening doorway. The graphic 
animation corresponding to a given section of the song (e.g., chorus, intro, bridge, ending, etc.) 
can be used each time that section is played during the song. Examples of transitions are: having 
the user go through a door from a tunnel with one set of visual characteristics, to a tunnel with a 
JO second set of visual characteristics. Another example is to have the user move through a 

transition doorway from a tunnel to a wider tunnel, or even an open area. The preferable feature 
Uj of this aspect of the present invention is to provide an engaging experience for the user by 
J coordinating an animation transition that is closely linked to a musical transition between song 
parts. 

^ 5 Alternatives to the I- Way and Underground concepts can also be advantageously used 

p with the present invention. For example, a user interface that visually depicts the instruments 
":f that are in the current song, and allows the user to select one to go into a tunnel or level where 

O parameters of the particular instrument may be adjusted. In this example, while the music is 

fli 

playing, the user interface provides visual representations of the instruments in the current song, 
20 with the active instruments preferably emitting a visual pulse in time with the music. Fig. 1 3 is 
an example of such a user interface. In accordance with such embodiments, the user can select a 
particular visual picture of an instrument (for example, such as with joystick 15 or function keys 
1 1) and go into that instrument. For example, by selecting the vibrating drumset 25, the user can 
go into another level, such as corresponding to the Underground mode discussed above with 
25 reference to Fig. 1 2, that has each drum shown that is currently being played. Then, the user can 
select and change different aspects of the drums, as well as the sound effects, and drum tracks. If 
the user selected another instrument such as are shown in Fig. 13, they would access a screen that 
allows them to similarly alter the parameters of that particular instrument track. Accordingly, the 
use of alternative themes for the user interface can be advantageously employed with the present 
30 invention, especially a theme where the actual instruments are depicted, as if on a stage. In 
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certain embodiments, both or multiple types of user interfaces are provided, and the user may 
select an I- Way type of user interface, such as shown in Fig. 7, or instrument group or other type 
of interface. What is important is that the user interface in preferred embodiments preferably 
provide an intuitive and easy to use way for users, who may have little experience in creating 
music, to visually appreciate the instruments used to create the music, and then have a visual way 
to access a mode in which parameters and effects associated with particular instruments may be 
modified by the user, which is preferably accompanied by a visual change that corresponds to the 
modified parameters/effects, etc. 

Additionally, in certain preferred embodiments, the use of an external video display 
device (e.g., computer monitor, television, video projector, etc.) is used to display a more 
elaborate visual accompaniment to the music being played. In such cases the I- Way graphical 
display preferably is a more detailed rendition of the I- Way shown in Fig. 7 (e.g., a higher 
resolution image in terms of color depth and/or dots per inch). 

In certain preferred embodiments, pressing Play preferably causes the lane instrument to 
enter Forced mode. This can be implemented to force Player 10 to play this instrument pattern at 
all times until Forced mode is exited by pressing Play again when the lane of that instrument is 
active. In this case, if the instrument was not playing at the time Forced mode is selected, Player 
10 can be configured to automatically compose the instrument pattern and play it starting at the 
end of the current sequence (e.g., 2 bars). In addition, pressing Play for a relatively long period 
(e.g., a second or more) can pause the music, at which time a "paused" message can flash in the 
status line. 

In other preferred embodiments, where such a Forced mode may not be desired (e.g., for 
simplicity, and/or because it may not be needed for a particular type of music), pressing Play 
briefly preferably causes a Pause to occur. Such a pause preferably would have a 'Paused' 
message appear on the Display 20, and preferably can be rhythmically quantized such that it 
begins and ends in musical time with the song (e.g., rhythmically rounded up or down to the 
nearest quarter note). 
Solos 

In Solo mode, all other instruments are muted (except for those that may already be in 
Solo mode) and only this instrument is playing. Solo mode preferably is enabled by entering a 
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tunnel or other level for a particular instrument, and, if the instrument is already playing entering 
Solo mode upon pressing of Play (e.g., the instrument is in Forced play and subsequent pressing 
of Play in Underground mode initiates Solo mode for that instrument; the particular key entry 
into Solo mode being exemplary). An instrument preferably remains soloed when leaving the 
5 corresponding tunnel and going back to the music I- Way. The user also preferably must re-enter 
the corresponding tunnel to exit Solo mode. Also, in certain embodiments multiple levels of 
Solo mode are possible in that you can solo several tracks, one at a time or at the same time, by 
going into different tunnels and enabling Solo mode. In addition, in certain embodiments the user 
preferably can enable/disable Solo mode from the I- Way by, for example, pressing Play for a 
4J) long time (e.g., 2 seconds) while in a lane. Following this example, upon disabling Solo mode, 
lit any lanes that had previously been manually muted (before Solo mode was invoked) preferably 
ly will remain muted. 

Jl Preferably, from a Sample menu different sample parameters can be edited. From the 

pf Samples menu, the user can record, play and change effects on voice, music or sound samples. 
45 This menu also preferably permits the creation and edition of sample lists. The LCD preferably 
y, displays "e. Samples" in the status line and a list of available samples or sample lists in the 
% pl storage media (for example, the SmartMedia card, discussed in connection with Fig. 32) to 
O choose from. 

When playing back a sample, the LCD preferably displays the play sample screen. The 
20 name of the sample preferably scrolls in a banner in the center right part of the LCD while the 
audio output level is indicated by a sizable frame around the name. The status line preferably 
shows the current effect. 

Sample sets or lists preferably are used by the e.DJ, for user songs, as well as MIDI files. 
In the case of MIDI files, preferably a companion PC software program (e.g., a standard MIDI 
25 editing software program such as Cakewalk) is used to enable the user to edit their own MDI 
files (if desired), and use MIDI non-registered parameter numbers (NRPNs are discussed below 
in more detail) to effectuate the playing of samples at a specific timing point. Following this 
example, the companion PC software program can be enabled to allow the user to insert samples 
into the MIDI data, using NRPNs. When a new e.DJ song is created, Player 10 preferably picks 
30 one of the existing sample lists (sample sets preferably being associated with the particular 
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Style/SubStyle of music) and then plays samples in this list at appropriate times (determined by 
an algorithm, preferably based on pseudo random number generation, as hereinafter described) in 
the song. When creating or editing a user song, the user preferably can associate a sample list to 
this user song. Then, samples in this list will be inserted automatically in the song at appropriate 
5 times. Each sample list can be associated with an e.DJ music Style/SubStyle. For instance, a list 
associated with the Techno Style can only be used by a Techno user song or by the e.DJ when 
playing Techno Style. In additional variations, the user preferably can specify specific timing for 
when a particular sample is played in a song, by way of NRPNs discussed below. This 
specification of the timing of a particular sample preferably can be indicated by the user through 
^ the use of a companion PC software program (e.g., a standard MIDI editing software program 
y such as Cakewalk), and/or through a text interface menu on the Player 10 itself 

lis? 

ty New Sample lists preferably are created with a default name (e.g., SampleListOOl). The 

7; list preferably can be renamed in the System-files menu. When the selected item is a sample, the 

current effect preferably is displayed in the status line. When the selected item is a sample list, 
AS "List" preferably is displayed in the status line. 

11 Playback of preferably compressed audio, MIDI, Karaoke, and User songs (e.g., e.DJ 

songs that have been saved) preferably is accessible via the "Songs" mode. Songs can be 
grouped in so-called Play lists to play programs (series) of songs in sequence. The LCD will 
display "e.Songs" in the status line and a list of available songs or Play lists on the SmartMedia 

20 card to choose from. 

Depending on the type of the song (for example, user song, MIDI or WMA), different 
parameters can be edited. The type of the current selection preferably is indicated in the status 
bar: WMA (for WMA compressed audio), MID (for MIDI songs), KAR (for MIDI karaoke 
songs), MAD x (for user songs {x=T for Techno Style, x=H for Hip-Hop, x=K for Cool, etc.}), 

25 and List (for Play lists). 

The name of the song preferably scrolls in a banner in the center right part of the LCD 
while the audio output level is indicated by a sizable frame around the name. If the song is a 
karaoke song, the lyrics preferably are displayed on two (or other number) lines at the bottom of 
the LCD, The animated frame preferably is not displayed. If the song is a user song (i.e., 
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composed by the e.DJ and saved using the Save/Edit button), the music I- Way mode is entered 
instead of the play song mode. 

The edit screen preferably is then displayed, showing two columns; the left column lists 
the editable parameters or objects in the item, the right column shows the current values of these 
parameters. For example, a Play list edit screen preferably will display slot numbers on the left 
side and song names on the right side. The current object preferably is highlighted in reverse 
video. 

Play lists are used to create song programs. New Play lists are preferably created with a 
default name (e.g., PlayListOOl), and preferably can be renamed by the user. When a list is 
selected and played in the song select screen, the first song on the list will begin playing. At the 
end of the song, the next song preferably will start and so on until the end of the list is reached. 
Then, if the terminating instruction in the list is End List, the program preferably stops and Player 
10 returns to the song select screen. If the terminating instruction is Loop List, the first song 
preferably will start again and the program will loop until the user interrupts the song playing, 
such as by pressing the stop button. 

In one embodiment of the present invention, the features of a conventional radio are 
effectively integrated into the user interface of the present invention (see, e.g., the FM receiver 50 
of Fig. 32). For example, when playing a station in Radio mode, the LCD preferably will display 
a radio screen. The LCD preferably will display "Radio" in the status line as well as a list of 
available station presets to choose from. If no preset has been preset, only the currently tuned 
frequency might be displayed. The name of the radio station (or frequency if it is not a stored 
preset) can scroll in a banner in the center right part of the LCD. An animation representing radio 
waves can also be displayed. The status line preferably shows the tuned frequency. In such 
embodiments Player 10 is enabled to operate as a conventional radio device. 

In preferred embodiments, radio-type functionality involves the use of the same type of 
Radio interface, with virtual stations of different Styles. Each virtual station preferably will 
generate continuous musical pieces of one or more of a particular Style or SubStyle. In this 
v.Radio mode, the user can "tune-in" to a station and hear continuous music, without the use of 
an actual radio. Such an arrangement can provide the experience of listening to a variety of 
music, without the burden of hearing advertising, etc., and allows the user to have more control 
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over the Style of music that is played. In such embodiments, a user will enter v.Radio mode and 
be presented with a list of v.Radio stations, each preferably playing a particular Style or SubStyle 
of music. The user then preferably "tunes" to a v.Radio channel by selecting a channel and 
pressing play, for example (see, e.g., Fig. 10), which causes Player 10 to begin auto-composing 
and playing songs in accordance with the particular v.Radio channel In certain embodiments, the 
v.Radio may be controlled to play user songs of the particular Style or SubStyle associated with 
the particular v.Radio channel, which may be intermixed with auto-composed songs of the 
particular type of SubStyle. In yet other embodiments, one or more v.Radio channels may be 
provided that play songs of more than a single Style or SubStyle, which also may be intermixed 
with user songs of various Styles or SubStyles. With such embodiments, the user is provided 
options to select the particular type of v.Radio channel that Player 10 "tunes" in. Additionally, in 
certain embodiments the v.Radio mode preferably can be used to play a variety of different song 
formats (e.g., MP3, WAV, WMA, eDJ, etc.). 

In accordance with certain embodiments, another variation of the Radio feature integrates 
some aspects of the v.Radio with other aspects of the Radio. As one example, a user could listen 
to a Radio station, and when a commercial break comes on, Player 10 switches to the v.Radio. 
Then, when the real music comes back on, the device can switch back to a Radio. Another 
integration is to have news information from the Radio come in between v.Radio music, 
according to selectable intervals. For example, most public radio stations in the USA have news, 
weather, and traffic information every ten minutes during mornings and afternoons. The v.Radio 
can be configured to operate as a virtual radio, and at the properly selected interval, switch to a 
public station to play the news. Then it can switch back to the v.Radio mode. These variations 
provide the capability for a new listening experience, in that the user can have more control over 
the radio, yet still be passively listening. It is considered that such an arrangement would have 
substantial use for commercial applications, as discussed elsewhere in this disclosure. 

Special functions can preferably be accessed from the System menu. These functions 
preferably include: file management on the SmartMedia card (rename, delete, copy, list, change 
attributes) (the use of such SmartMedia or other Flash/memory/hard disk type of storage medium 
is discussed, for example, in connection with Fig. 32), Player configuration (auto-play, power off, 
delay, keypad auto-repeat, language, etc.), firmware upgrade, SmartMedia card formatting, 
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microphone settings, and equalizer user presets. The Player can preferably modify various 
attributes of a file stored on the SmartMedia card. As a precaution, by default, all system files 
preferably can be set as read only. 

In certain embodiments a User Configuration interface preferably enables the user to enter 
a name to be stored with the song data on the removable memory storage (e.g., SMC), and/or to 
enable the user to define custom equalization settings, and/or sound effects. As an example of 
EQ settings, it is preferable to enable the user to select from a group of factory preset equalizer 
settings, such as flat (e.g., no EQ effect), standard (e.g., slight boost of lower and higher 
frequencies), woof (e.g., bass frequency boost), and hitech (e.g., high frequency boost). In 
addition to such preset EQ settings, it is preferable to enable the user to define their own desired 
settings for the EQ (as an example, a 4 band EQ with the ability to adjust each of the 4 bands by 
way of the joystick). Additionally, in certain embodiments it is preferable to enable the user to 
similarly customize sound effects to be used for particular samples. Following this example, in 
addition to a set of standard factory preset sound effects such as Lowvoice (e.g., plays the song 
with a slower speed and lower pitch to enable the user to sing along with a lower voice), reverb, 
Highvoice (e.g., plays the song with a faster speed and higher pitch), Doppler (e.g., varying the 
sound from Highvoice to Lowvoice), and Wobbler (e.g., simulating several voices with effects), 
it is preferable to make a customized effect capability available to the user that can incorporate 
various combinations of standard effects, and in varying levels and/or with varying parameter 
values. 

When the user saves a song that is being played in e-DJ mode, the song is preferably 
created with a default name (e.g. TECHNO001). The song can preferably be renamed in the 
System-files menu. When entering the Files menu, files present on the SmartMedia card and the 
free memory size are preferably listed in an edit screen format. The status line preferably 
indicates the number of files and the amount of used memory. The file management menu 
preferably offers a choice of actions to perform on the selected file; delete, rename, copy, change 
attributes, etc. The name of the current file preferably is displayed in the status line. 
Additionally, in certain embodiments it is preferable to enable the use of System parameter files 
that contain, for example, settings for radio presets (e.g., radio station names and frequencies), 
settings for certain parameters (e.g., pitch, tempo, volume, reverb, etc.) associated with music 
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files such as WAV, WMA, MP3 ? MIDI, Karaoke, etc. In these embodiments it is preferable for 

the parameter setting to apply to the entire file. 

When entering the Configuration menu, an edit screen preferably is displayed showing 

various configurable parameters. Fig. 14 describes some of the parameters that are preferably 
5 configurable by the Configuration menu, along with possible values. When modifying a selected 

character in a file name, Forward preferably can be used to insert a character after the highlighted 

one, and Backward preferably to delete the highlighted character. To save the edits and go back 

to file menu, Play preferably can be used. 

When selecting copy, a screen proposing a name for the destination file in a large font 
J.0 preferably is displayed. This name preferably is proposed automatically based on the type of the 

source file. For instance if the source file is a Hiphop user song, the proposed name for the 
isj destination file could be HIPHOP001 (alternatively, the user preferably can use the rename 
jS procedure described above to enter the name of the destination file). 
j,jf The Firmware Upgrade menu preferably permits the upgrade of the Player firmware 

AS (embedded software) from a file stored on the SmartMedia card. Preferably, it is not possible to 
j£[ enter the Upgrade firmware menu if no firmware file is available on the SmartMedia card. In this 
k{ case a warning message is displayed and the Player preferably returns to Systems menu. In 
0 additional embodiments, the use of a bootstrap program preferably is enabled to allow the 

firmware to be updated from a removable memory location (e.g., SMC). Such a bootstrap 
20 program preferably can alternatively be used for upgrading the DSP 42 soundbank located in 

Flash 49. 

The Player function keys, identified in Fig. 2, preferably are comprised of the standard 
buttons found on CD-players or VCRs, and are used to control the playback of songs (e.g.; 
Player-proprietary, MIDI, WMA, MP3, etc). The Record key controls recording (e.g.; samples). 

25 When used in editing or selection menus the player keys also have the following actions: Play 
preferably is used to select a sub menu or validate a change, Stop preferably is used to go back to 
previous menu, cancel an action or discard a change, Forward preferably is used to insert an item 
in a list, and REVERSE preferably is used to delete an item in a list. This is one example of how 
to use a minimum of keys in a device, while retaining a relatively large set of features, while also 

30 keeping the user interface relatively intuitive for a variety of users. 
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When a list is selected in the song select screen, pressing Play preferably will start playing 
the first song in the list. While the sample lane is selected, Play preferably can be configured to 
start playing the selected sample. While in an instrument lane, Play preferably can be configured 
to enter solo mode for the current instrument, or Forced mode. 

To create a song/sample list, Forward preferably can be used while in the song or sample 
select screen. 

To leave an edit screen, Stop preferably can be used to discard the edits and exit. For 
example, in the sample selection screen press Stop to go back to the Home screen. Additionally, 
for any given instrument during playback, Stop preferably can be used as a toggle to 
mute/unmute the instrument. 

Record preferably can be pressed once to start recording a sample (recording samples 
preferably is possible in almost any operating mode of the Player). Record preferably can be 
pressed again to end the recording (recording preferably is stopped automatically if the size of the 
stored sample file exceeds a set size, such as 500Kbytes). The record source preferably is chosen 
automatically depending on the operating mode. If no music is playing, the record source 
preferably is the active microphone (local or docking station). If music is playing songs, e.DJ or 
radio, the record source preferably is a mix of the music and the microphone input if not muted. 
Further, it is possible to use different sample recording formats that together provide a range of 
size/performance options. For example, very high quality sample encoding format may take 
more space on the storage medium, while a relatively low quality encoding format may take less 
space. Also, different formats may be more suited for a particular musical Style, etc. 

In v-Radio mode, to listen to the selected station, Play preferably can be used. Press Play 
to mute the radio. Press Stop to go back to station preset selection screen. To launch an 
automatic search of the next station up the band, press Forward until the search starts. To launch 
an automatic search of the next station down the band, press Backward until the search starts. 
Press Forward/Backward briefly to fine-tune up/down by 50kHz steps. 

In eDJ Mode, while in Sample lane, Play preferably can be pressed to play a selected 
sample. If sample playback had previously been disabled, the first press on Play preferably will 
re-enable it. Subsequent presses preferably will play the selected sample. If a sample if playing, 
Stop preferably will stop it. If no sample is playing, pressing Stop preferably will mute the 
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samples (i.e. disable the automatic playback of samples by the e-DJ when returning to I- Way 
mode). When muted, "Muted" preferably is displayed in the status bar and the round speaker 
preferably disappears on the I- Way sample lane. 

In Song mode, to start the playback of selected song or Play list, preferably press Play and 
the LCD will preferably display the play song screen. In Song mode, Stop preferably can be 
pressed to stop the music and preferably go back to song selection screen. Preferably press 
Forward briefly to go to next song (if playing a Play list, this preferably will go to the next song 
in the list; otherwise, this preferably will go to the next song on the SmartMedia). Preferably 
press Forward continuously to fast forward the song. Preferably press Backward briefly to go to 
the beginning of the song and a second press preferably takes you to the previous song (if playing 
a Play list, this preferably will go to the previous song in the list; otherwise, this preferably will 
go to the previous song on the SmartMedia). Preferably press Backward continuously to quickly 
go backward in the song. 

Pressing Stop can be a way to toggle the muting of an instrument/lane. For example, 
when on a Drums lane, pressing Stop briefly preferably can mute the drums, and pressing it again 
briefly preferably can un-mute the drums. Additionally, pressing Stop for relatively long period 
(e.g., a second or so) preferably can be configured to stop the music and go back to Style 
selection screen. 

Forward preferably can be configured to start a new song. Backward preferably can be 
used to restart the current song. 

Forward or Backward preferably can be used to keep the same pattern but change the 
instrument playing (preferably only "compatible" instruments will be picked and played by the 
Player). 

Preferably press Stop to mute microphone. Preferably press Play to un-mute the 
microphone. 

To start the playback of the selected sample, preferably press Play. Preferably press Stop 
to stop the sample and go back to sample selection screen. 

In Song mode, preferably press Play to pause the music. Preferably press Play again to 
resume playback. Pressing Forward key in the song select screen preferably will create a new 
Play list. In the song selection screen, preferably press Stop to go back to the Home screen. 



28 



In the Style selection screen preferably press Stop to go back to the Home screen. 

To enter the file management menu for the highlighted file, preferably press Play. 

While browsing the file management list, preferably press Forward to scroll down to next 
page. Press Backward preferably to scroll up to previous page. 

In the file management menu, to start a selected action, preferably press Play, 

When selecting Delete, preferably a confirmation screen is displayed. 

When selecting Rename, preferably a screen showing the name of the file in big font is 
displayed and the first character is preferably selected and blinking. 

When copying a file, preferably press Play to validate the copy. If a file of the same type 
as the source file exists with the same name, preferably a confirmation screen asks if the file 
should be overwritten. Select YES or No and preferably press Play to validate. Press Stop to 
abort the copy and preferably return to file menu. It is a preferable feature of this embodiment to 
allow files to be copied from one removable memory storage location (e.g., SMC) to another by 
use of MP 36 RAM. In this example, it is a desirable to enable the copying of individual song or 
system files from one SMC to another without using a companion PC software program, 
however, in the case where an entire removable memory storage volume (e.g., all the contents of 
a particular SMC) is to be copied, it is desirable to use a companion PC software program to 
allow larger groups of data to be temporarily buffered (using the PC resources) by way of the 
USB connection to the PC. Such a feature may not be possible in certain embodiments without 
the PC system (e.g., using the MP 36 internal RAM) because it likely would involve the user 
repeatedly swapping the SMC target and source volumes. 

The e-DJ, v-Radio, Songs, Samples and System direct access keys detailed in Fig. 3 
preferably permit the user to directly enter the desired mode from within any other mode. These 
keys preferably can also be used to stop any mode, including the current mode. This can be 
faster than the Stop key, because in some cases, such as while in eDJ Mode inside a lane, the 
Stop key preferably may be used to mute the lane, rather than stop the eDJ Mode. 

The audio output control is identified in Fig. 1 as Vol. Up/Down. Audio output control 
keys preferably are also used to control the microphone input when used in combination with 
prefix keys. 
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The Up/Down/Left/Right keys preferably comprise a joystick that can be used for: menu 
navigation, song or music Style selection, and real time interaction with playing music. 
Additionally, Up/Down preferably can be used for moving between modes such as the 
Underground & I- Way modes in an intuitive manner. 

When editing a list, objects preferably can be inserted or deleted by pressing Forward to 
insert an object after the highlighted one or pressing Backward to delete the highlighted object. 

To browse the list or select parameters, preferably use Up/Down. To edit the highlighted 
object preferably press Right. Press Left preferably to go directly to first item in the list. 

In instrument tunnels (i.e.; Drums, Bass, Riff and Lead), Right preferably can be 
configured to compose a new music pattern. Similarly, Left preferably can be used to return to 
previous patterns (see note below on music patterns). The new pattern preferably will be 
synchronized with the music and can start playing at the end of the current music sequence (e.g., 
2 bars). In the mean time, preferably a "Composing. . ." message can be configured to appear on 
the status line. Additionally, Down preferably can be used to compose a new music pattern 
without incrementing the pattern number. This preferably has the same effect as Right (compose 
and play another pattern), except that the pattern number preferably won't be incremented. 

One benefit of these composition features is that they enable the user to change between 
patterns during a live performance. As can be appreciated, another reason for implementing this 
feature is that the user preferably can assemble a series of patterns that can be easily alternated. 
After pressing Right only to find that the newly composed pattern is not as desirable as the 
others, the user preferably can subsequently select Down to discard that pattern and compose 
another. Upon discovering a pattern that is desirable, the user preferably can thereafter use Right 
and Left to go back and forth between the desirable patterns. Additionally, this feature preferably 
allows the system to make optimum use of available memory for saving patterns. By allowing 
the user to discard patterns that are less desirable, the available resources preferably can be used 
to store more desirable patterns. 

In the file management menu, to select a desired action, preferably use Up/Down. When 
renaming files, the user preferably can use Left/Right to select the character to be modified, and 
Up/Down to modify the selected character. Pressing Right when the last character is selected 
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preferably will append a new character. The user preferably can also use the Forward/Backward 
player function keys at these times to insert/delete characters. 

In the microphone tunnel, Left/Right preferably can be configured to change microphone 
input left/right balance. In the sample tunnel, Left/Right preferably can be used to select a 
5 sample. Pressing Forward in the sample select screen preferably will create a new sample list. 

Down is an example of an intuitive way to enter the Underground mode for the current I- 
Way mode lane. In this mode, the user preferably can change the pattern played by the selected 
instrument (drums, bass, riff or lead) and preferably apply digital effects to it. Similarly, Up 
preferably can be configured to go back to music I- Way from the Underground mode, 
dp In v-Radio mode, to select the desired station preset, preferably use Up/Down. Preferably 

*:( use Up/Down to go to previous/next station in the preset list and preferably press Save/Edit while 
W a station is playing to store it in the preset list. 

JL: The Save/Edit key preferably can be used to save the current song as a User song that can 

III be played back later. Such a song preferably could be saved to a secondary memory location, 
45 such as the SmartMedia card. In the case of certain Player embodiments, this preferably can be 
| B * done at any time while the e-DJ song is playing, as only the "seeds" that generated the song 
*g preferably are stored in order to be able to re-generate the same song when played back as a User 
y song. In certain embodiments it is preferable to incorporate a save routine that automatically 

saves revised files as a new file (e.g., with the same name but a different suffix). Such a feature 
20 can be used to automatically keep earlier versions of a file. 

While the use of seeds is discussed elsewhere in this disclosure, it may be helpful at this 
point to make an analogy on the use of the Save/Edit 17 key. This key is used to save the basic 
parameters of the song in a very compact manner, similar to the way a DNA sequence contains 
the parameters of a living organism. The seeds occupy very little space compared to the 
25 information in a completed song, but they are determinative of the final song. Given the same set 
of saved seeds, the Player algorithm of the present invention preferably can generate the exact 
same sequence of music. So, while the actual music preferably is not stored in this example 
(upon the use of the Save/Edit 17 key), the fundamental building blocks of the music is stored 
very efficiently. The desirability of such an approach can be appreciated in a system with 
30 relatively limited resources, such as a system with a relatively low-cost/low performance 
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processor and limited memory. The desirability of such a repeatable, yet extremely compact 
method of storing music can also be contemplated in certain alternative embodiments, such as 
those involving the communication with other systems over a relatively narrow band 
transmission medium, such as a 56kbps modem link to the internet, or an iRDA/bluetooth type of 
link to another device. Clearly this feature can be advantageously employed using other 
relatively low bandwidth connections between systems as well. Additionally, this feature allows 
the user to store many more data files (e.g., songs) in a given amount of storage, and among other 
advantages, this efficiency enhances other preferable features, such as the automatic saving of 
revised files as new files (as discussed above). 

In certain embodiments, it is desirable to check the resources available to a removable 
memory interface (e.g., the SMC interface associated with SMC 40) to safeguard the user song in 
instances where a removable memory volume is not inserted, and/or there is not enough available 
storage on an inserted removable memory volume. In these cases, when the user saves a song 
(e.g., pushes the Save/Edit key 17 button) it is advantageous to prompt the user to insert an 
additional removable memory volume. 

The name of the song preferably can be temporarily displayed in the status line, in order 
to be able to select this song (as a file) later on for playback. Of course the song file name 
preferably can be changed later on if the User wishes to do so. Once an item has been created, it 
preferably can be edited by selecting it in the song or sample selection screens and pressing 
Save/Edit. Pressing Save/Edit again will preferably save the edited item and exit. When the 
On/Off key is pressed for more than 2 seconds, the Player preferably can be configured to turn on 
or off, yet when this combination is pressed only briefly, the On/Off key can alternatively 
preferably be configured to turn the LCD backlight on or off. 

When Pitch/Tempo is pressed simultaneously with Left or Right, it preferably can be used 
as a combination to control the tempo of the music. When Pitch/Tempo is pressed 
simultaneously with Up/Down, it preferably can control the pitch of the microphone input, the 
music, etc. 

When Effects/Filters is pressed simultaneously with Left/Right or Up/Down, it preferably 
can control the effect (for example, cutoff frequency or resonance) and/or volume (perhaps 
including mute) applied on a given instrument, microphone input, or sample. 



32 



As will be appreciated by one of ordinary skill in the art, other related combinations can 
be employed along these lines to provide additional features without detracting from the usability 
of the device, and without departing from the spirit and scope of the present invention. 

Various examples of preferred embodiments for the structuring of a song of the present 
5 invention will now be described. Preferably for a new song, the only user input needs to be an 
input Style. Preferably even this is not required when an auto-play feature is enabled that causes 
the Style itself to be pseudo-randomly selected. But assuming the user would like to select a 
particular Style, that is the only input preferably needed for the present embodiment to begin 
song generation. 

sip Before moving into the actual generation process itself, it is important to note that 

O preferably implicit in the user's Style selection can be a Style and a SubStyle. That is, in certain 
y embodiments of the present invention, a Style is a category made up of similar SubStyles. In 
J these cases, when the user selects a Style, the present embodiment will preferably pseudo- 
nj randomly select from an assortment of SubStyles. Additionally, it is preferably possible for the 
J 5 user to select the specific SubStyle instead, for greater control. In these particular embodiments, 
J preferably whether the user selects a Style or a SubStyle, the result preferably is that both a Style 
U and a SubStyle can be used as inputs to the song generation routines. When the user selects a 
|l| SubStyle, the Style preferably is implicitly available. When the user selects a Style, the SubStyle 
preferably is pseudo-randomly selected. In these cases, both parameters are available to be used 
20 during the song generation process to allow additional variations in the final song. 

As shown in Fig. 15, the Song is preferably comprised of a series of Parts. Each part 
preferably might be an intro, theme, chorus, bridge, ending, etc.; and different parts preferably 
can be repeated or returned to later in a song. For example, one series of parts might be: intro, 
theme, chorus, theme, chorus, theme, chorus, end. Certain Styles preferably may have special 
25 types of parts, and other Styles preferably may only use a subset of the available parts. This 
depends on the desired characteristics for a particular Style or SubStyle. For example, a 'cool' 
Style may not use a bridge part. Additionally, certain Styles that have a generally faster tempo 
preferably can use a virtually-doubled part size by simply doubling each part (i.e., intro, theme, 
theme, chorus, chorus, theme, theme, chorus, chorus, etc.). 
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Also, in certain cases, the user experience preferably may benefit from having the display 
updated for a particular Part. For example, an indication of the current position within the 
overall length of the song may be helpful to a user. Another example is to alert the user during 
the ending part that the song is about to end. Such an alert preferably might involve flashing a 
5 message (i.e., 'Ending') on some part of the display, and preferably will remind the user that they 
need to save the song now if they want it saved. 

Another optimization at this level is preferably to allow changes made by the user during 
the interactive generation of a song to be saved on a part-by-part basis. This would allow the 
user to make a change to an instrument type, effect, volume, or filter, etc., and have that revised 
is 10 characteristic preferably be used every time that part is used. As an example, this would mean 

O that once a user made some change(s) to a chorus, every subsequent occurrence of the chorus 

O 

m would contain that modified characteristic. Following this particular example, the other parts of 

~i% 

the song would contain a default characteristic. Alternatively, the characteristic modifications 
111 preferably could either be applied to multiple parts, or preferably be saved in real time 
15 throughout the length of the song, as discussed further below. 

r* Each Part preferably can be a different length, and preferably can be comprised of a series 

O of SubParts. One aspect of a preferred embodiment involves the SubPart level disclosed in Fig. 

Q 15, but the use of the SubPart level is optional, in that the Part structure can be comprised 
directly by Sequences without the intervening SubPart level. 

20 In certain embodiments, where a SubPart layer is implemented, each SubPart preferably 

can be of a different size. Such an approach can enhance the feel of the resulting musical 
composition, as it affords a degree of variety to the Parts. 

Each SubPart preferably is comprised of a series of Sequences (SEQs). In keeping with 
the previous comment regarding the relationship between consistent sizing and flexibility of rule 

25 applications, each SEQ preferably can be the same length and time signature. In the example of 
Fig. 15, each SEQ is two bars long with a 4/4 time signature. Of course, these can be adjusted in 
certain variations of the invention, but in this example, this arrangement works well, because it 
allows us to illustrate how we can hold notes across a measure boundary. Typically, it might be 
advantageous to lengthen the size of the SEQs (as well as the RPs to be discussed hereinafter) to 
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allow greater diversity in the musical outcome. Such a variation is certainly within the scope of 
the present discussion, as well as Fig. 15. 

Following the example of Fig. 15, each SEQ preferably consist of multiple Real Patterns 
(RPs) in parallel. Generally, it is useful to have 1 RP for each type of instrument. In this case, a 
type of instrument preferably corresponds to a single lane of the I- Way user interface (i.e., drums, 
bass, riff, etc.). RP data preferably is actual note data; generally, information at this level 
preferably would not be transposed unless through user interaction, and even then such 
interaction preferably would likely apply to multiple instruments. Of course this is a user 
interface decision, and is not a limitation to the embodiments discussed here. 

In this case, the multiple RPs preferably are merged together to comprise the SEQ. As 
will be recognized by those skilled in the art, this is analogous to the way a state-of-the-art MIDI 
sequencer merges multiple sets of MIDI Type 1 information into MIDI Type 0 file. 

Further background detail on this can be found in the "General MIDI Level 2 
Specification" (available from the MIDI Manufacturer's Association) which is hereby 
incorporated by reference. 

One reason for allowing multiple RPs in parallel to define a SEQ, is that at certain times, 
certain lanes on the I-Way may benefit from the use of multiple RPs. This is because it may be 
desirable to vary the characteristics of a particular piece of the music at different times during a 
song. For example, the lead preferably may be different during the chorus and the solo. In this 
case it maybe desirable to vary the instrument type, group, filtering, reverb, volume, etc., and 
such variations can be enacted through the use of multiple RPs. Additionally, this method can be 
used to add/remove instruments in the course of play. Of course, this is not the only way to 
implement such variations, and it is not the only use for multiple RPs. 

Following the example of Fig. 15, each RP preferably is comprised of two bars, labeled 
RPx and RPy. Such a two bar structure is useful because it preferably allows some variations in 
MIDI information (chord changes, sustain, etc.) across the internal bar boundary. Such variation 
can provide the effect of musical variation without adding the complexity of having chordal 
changes occur inside a bar, or having notes sustained among multiple RPs. 

Generally, it is cumbersome to allow notes to be held over multiple RPs. This is partly 
because of the characteristics of MIDI, in that to hold a note you need to mask out the Note Off 
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command at the end of a pattern, and then mask out the Note On command at the beginning of 
the next pattern. Also, maintaining the same note across pattern boundaries is a concern when 
you switch chords, because the end of a pattern preferably is an opportunity to cycle through the 
chord progression, and you need to make sure that the old note being sustained is compatible 
5 with the new chord. The generation and merging of chord progression information preferably 
occurs in parallel with the activities of the present discussion, and shall be discussed below in 
more detail. While is considered undesirable to hold notes across patterns, there are exceptions. 

One example of a potentially useful time to have open notes across multiple patterns is 
during Techno Styles when a long MIDI event is filtered over several patterns, herein called a 
10 'pad'. One way to handle this example, is to use a pad sequence indicator flag to check if the 
p current SEQ is the beginning, in the middle, or the end of a pad. Then the MIDI events in the 
^ pad track can be modified accordingly so that there will be no MIDI Note Offs for a pad at the 
*«j beginning, no MIDI Note Ons at the beginning of subsequent RPs, and the proper MIDI Note 
ry Offs at the end. 

p Continuing our discussion of Fig. 15, RPs preferably are comprised of Virtual Patterns 

w (VPs) that have had musical rules applied to them. Musical rales are part of the generation and 
p merging of chord progression information that will be discussed in more detail below. A VP can 
q be generally thought of as the rhythm of a corresponding RP, along with some general pitch 
1 y information. Preferably, musical rules are applied to the VP, and the result is the RP. Musical 
20 rules are discussed in more detail below. 

A VP preferably can be considered as a series of Blocks. In the example of Fig. 15, each 
Block has two dimensions: Blockd and Blockfx, but this is but one possible variation. In this 
example, Blockd corresponds to the data of the block, and Blockfx corresponds to effects that are 
applied to the data (i.e., volume, filtering, etc.). In this example, the Blockd information can be 
25 thought of as individual rhythmic pattern information blocks selected from a variety of possible 
rhythmic blocks (certain desirable approaches to create such a variety of possible rhythmic 
blocks, and the corresponding selection thereof in creating a VP, is discussed in greater detail 
later in this disclosure, with reference to Figs. 22 and 23). 

The Blockfx dimension described in Fig. 15 is an optional way to add certain preferably 
30 characteristics to the Blockd information. For example, in addition to volume or filtering 
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information mentioned above, the Blockd dimension preferably can be used for allocation or 
distribution of musical information predictors, discussed in more detail below as Virtual 
Note/Controller (VNC) information. However, the Blockfx dimension is optional, and the 
Blockd information can be processed independently of such volume or filtering information, to 
great success. 

Assuming the example presented earlier wherein the time signature is 4/4 and the RP is 
two bars, all Blocks in a pattern preferably must add up to 8 quarter notes in duration. In this 
example, assuming n Blocks in a particular RP, the duration in quarter notes of each Block in the 
corresponding VP would be between 1 and (8-{n-l». While this example describes 4/4 time 
with a quarter note being the basic unit of length for a Block, simple variations to this example 
preferably would include alternate time signatures, and alternate basic units for the Block (i.e., 
13/16 time signature and 32 nd note, respectively, etc.). 

Getting at the bottom of Fig. 15 we see an optional implementation of SubBlocks (SBs). 
Such an implementation could preferably be used, for example, for the drum lane of the I- Way 
during certain Styles, where it might be desirable to have separate SBs for the bass drum, cymbal, 
snare, etc. A further optimization of this implementation of the present embodiment would be to 
have the SB level of the drum lane preferably comprise directly the VP of the drum lane. Such 
an arrangement preferably would effectively remove the complexity of having a separate Blockfx 
for each individual SB of the drum lane. An example of where such an optimization might be 
useful when implementing the present invention is in an environment with limited resources, or 
an environment where having separate effects for separate parts of the drums (snare, bass drum, 
etc.) is not otherwise desirable. 

Additionally, in some applications of the present invention, it may be desirable to enable 
certain levels in Fig. 15 to be bypassed. In such cases, this would preferably allow a user to input 
real pattern data in the form of actual note events (e.g., in real time during a song via a MIDI 
instrument as an input). Further, with the use of a companion PC software application (and a 
connection to the PC), in certain embodiments it is preferable to allow users to input their own 
MIDI patterns for use as Block data. 

Various examples of preferred embodiments of the Music Rules used in the creation of a 
Song of the present invention will now be described. 
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Fig. 16 is a flow diagram depicting a general overview of a preferred approach to 
generating music in the context of the present invention. Starting at step 1, a style of music and a 
selected instrument are defined or loaded. Once the style of music and the type of instrument are 
known, the algorithm can apply Block rules to develop individual virtual pattern sub-blocks (e.g., 
5 those shown in Fig. 22). In certain alternative embodiments, the individual virtual pattern sub- 
blocks preferably are selected from a list or other data structure. Once the sub-blocks are 
available (e.g., from a list or from a block rule algorithm) they are processed into a Virtual 
Pattern (VP) at step 2. At this point in this example, a VP preferably is not music, although it 
does contain rhythmic information, and certain other embedded musical characteristics. At step 
|L0 3, using the embedded musical characteristics of the VP data structure, musical rules preferably 
«| are applied to the VP to add more musicality to the pattern, and the result preferably contains 
W both the rhythmic information of the VP, as well as actual musical information. At step 4 a tonic 
j= is preferably applied to the output from step 3, in that each measure preferably is musically 
pj transposed according to a tonic algorithm to impart a chordal progression to the data structures. 
45 Then at step 5, a mode preferably is applied that makes subtle changes to the musical information 
L.i to output music information preferably set to a particular musical mode. Then, at step 6, a key 
preferably is applied to the data structure to allow key changes, and/or key consistency among 
Q various song components. Finally, at step 7, a global pitch adjustment preferably can be applied 
to the data structure, along with the rest of the song components, to allow real time pitch/tempo 
20 shifting during song play. 

This process of applying various musical rules to generate a RP preferably can be a part 
of the overall song generation process mentioned above in connection with Fig. 15. Before going 
through the steps described in Fig. 16 in more detail, a discussion of the embedded characteristics 
mentioned above, as well as some mention of tonic and key theory will be helpful. 
25 Bearing in mind that the MIDI Specification offers a concise way to digitally represent 

music, and that one significant destination of the output data from the presently discussed 
musical rules is the MIDI digital signal processor, we have found it advantageous to use a data 
format that has some similarities with the MIDI language. In the discussion that follows, we go 
through the steps of Fig. 16 in detail, with some examples of the data that can be used at each 
30 step. While the described data format is similar to MIDI, it is important to understand the 
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differences. Basically, the present discussion describes how we embed additional context- 
specific meaning in an otherwise MIDI compliant data stream. During processing at each of the 
steps in Fig. 16, elements of this embedded meaning preferably is extracted, and the stream 
preferably is modified in some musical way accordingly. Thus, one way to consider this process 
5 is that at each step, our stream becomes closer to the actual MIDI stream that is played by the 
MIDI DSP (this aspect is addressed in more detail below with reference to Fig. 21). 

In the present example it is considered advantageous to break down the rhythmic and 
musical information involved in the music into Virtual Notes and/or Controllers (VNC). In the 
example of Fig. 17, we have provided several examples of VNCs that we have found to be 
iJp useful. Basically, these VNCs represent our way of breaking down the musical rules of a 
jjj particular genre into simplified mechanisms that can be used by an algorithm preferably along 
|y with a certain random aspect to generate new music that mimic the characteristics and variety of 
7; other original music in the genre. Depending on the Style of music, different types of VNCs will 
Vz be useful. The list in Fig. 17 is simply to provide a few examples that will be discussed later in 
45 more detail. 

| k l In an important feature of this aspect of the present invention is that we have embedded 

% control information for the music generation algorithm into the basic blocks of rhythmic data 
drawn upon by the algorithm. We have done this in a preferably very efficient manner that 
allows variety, upgradeability, and complexity in both the algorithm and the final musical output. 

20 A key aspect of this is that we preferably use a MIDI-type format to represent the basic blocks of 
rhythmic data, thus enabling duration, volume, timing, etc. Furthermore, we preferably can use 
the otherwise moot portions of the MIDI-type format of these basic blocks to embed the VNC 
data that informs the algorithm how to go about creating a part of the music. As an example, we 
preferably can use the pitch of each MIDI-type event in these basic sub-blocks of rhythmic data 

25 to indicate to the algorithm what VNC to invoke in association with that MIDI-type event. Thus, 
as this rhythmic data is accessed by the algorithm, the pitch-type data preferably is recognized as 
a particular VNC, and replaced by actual pitch information corresponding to the VNC function. 
Fig. 17 shows, in the first column, examples of such embedded values, and in the second and 
third columns, examples of recognized VNC nomenclature, and potential pitch information 

30 associated therewith. 
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In the example of Fig. 17, the fundamental type of VNC preferably is the Base Note. This 
can be considered in certain musical styles as the cornerstone of the melody, except, for example, 
when these notes are relatively short notes in a run. This is why rhythm exists in a VP to provide 
context to the VNCs. Example values of the Base Note are C,E,G or B. Which value is finally 
used preferably depends on a pseudo-random seed as part of an algorithm. We find that in these 
examples, these values provide pretty good music for the genres we have studied so far. The 
Magic Notes preferably can have the values indicated in Fig. 17 (assuming a diatonic scale is 
used), and these values are preferably relative to the preceding Base Note. Unlike a Base Note, 
Magic Notes preferably are useful at providing a note that does not strongly impact the melody. 
For example, the algorithm will see that the next note to be generated is a Magic Note 1, and it 
will use the Pseudo Random Number Seed to predictably select one of the possible values: +1, - 
1, +2, -2. The predictably-selected value preferably will be used to mathematically adjust the 
value from the preceding Base Note to preferably result in a note value. Following this example, 
if the preceding Base Note was a C2, and the result of the algorithm is to select a +1, then the 
Magic Note value is a D2. Note that preferably the only difference between Magic Note 0 and 1 
is that Magic Note 0 can have a value of 0. Thus, the use of Magic Note 0 will occasionally 
result in a note that is the same value as the preceding Base Note. This is an example of a way to 
influence the sound of a particular Style in relatively subtle ways. 

In the discussion above, by 'predictably-selected' we refer to the process of pseudo- 
randomly selecting a result based on a seed value. If the seed value is the same, then the result 
preferably will be the same. This is one way (though not the only way) to enable reproducibility. 
Further discussion of these pseudo random and seed issues is provided elsewhere in the present 
specification. 

Continuing with Fig. 17, a High Note preferably simply adds an octave to the preceding 
Base Note, and is useful to make a big change in the melody. What is interesting here is that 
multiple VNCs preferably can occur in between the previous Base Note and the High Note, and 
this is a way to allow a musical phrase run to a tonic note, corresponding to an earlier Base Note. 
Obviously, this VNC is very useful, as it again preferably enables the structure of music to exist 
before the actual music itself is written. The algorithm preferably does not know what the final 
key, or mode will be at this point, but the octave and tonic preferably are available. 
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Similar to the Magic Note, the Harmonic Note VNC preferably allows the algorithm to 
pseudo-randomly select a harmonic from a set of possible harmonics. This capability is useful 
when there are multiple notes sounding at the same time in a chord. When this VNC is used, it 
preferably can result in any of the relative harmonics described in Fig. 17. These values are only 
examples of possible values, and ones that we find particularly useful for the types of music we 
have addressed. 

Last Note is a VNC that is very similar to the Base Note, except that it preferably only 
contains a subset of the possible values. This is because, as we understand musical phrasing for 
the types of music we address, the final note preferably is particularly important, and generally 
sounds best when it has a relative value of C or G (bearing in mind that in this example, all the 
notes preferably can subsequently be transposed up or down through additional steps). As with 
all the VNCs, the precise note that might be played for this value preferably depends on the 
Mode and Key applied subsequently, as well as general pitch shifting available to the user. 
However, in the music we address, we find this to be a useful way to add subtlety to the music, 
that provides a variety of possible outcomes. 

One Before Last Note is a VNC that preferably immediately precedes the Last Note. 
Again, this is because we have found that the last two notes, and the harmonic interval between 
them, are important to the final effect of a piece, and accordingly, we find it advantageous with 
the Final Notes of C and G to use One Before Last Notes of E, G, or B. These values can be 
adapted for other Styles of music, and only represent an example of how the VNC structure can 
be effectively utilized. 

The last example VNC in Fig. 17 is the ALC controller. This is one example of how 
certain musical non-pitch concepts can preferably be employed using a MIDI controller. In this 
example, the ALC controller can be thought of as a prefix which modifies the meaning of 
immediately following notes. The ALC controller can be used to indicate that the next note is to 
be treated in a special manner, for example, to setup a chord. In this example, you can use a 
particular predefined value for the ALC controller to precede a sequence of a fixed note with 
additional harmonic notes. Similar to the Magic Note VNC discussed above, the Harmonic Notes 
following a ALC controller preferably allow the algorithm to pseudo-randomly select a harmonic 
from a set of possible harmonics. This capability is useful when there are multiple notes 
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sounding at the same time in a chord. When this VNC is used, it preferably can result in any of 
the relative harmonics described in Fig. 17. These values are only examples of possible values, 
and ones that have been found particularly useful for the types of music addressed up to the time 
hereof. Another example use of the ALC controller is to setup fixed notes. In this case, 
preferably one follows the appropriate ALC controller with Fixed Note values for any desired 
actual note value. This approach is useful in many instances to have a more carefully limited 
song output where a particular interval between notes in the desired music can be achieved. 
Additionally, playing well-known phrases or sequences preferably is possible with this use of the 
ALC controller. One preferably could encode portions of an entire song this way to have a piece 
that closely resembles an existing musical piece. In this example, one preferably could have 
certain parts of the music still interactively generated to enable a song to sound just like an 
existing song (in melody, for example), yet preferably still allow other parts to be different (like 
bass or drums, for example). 

In this manner, you can setup the resulting chord because the ALC value preferably will 
alert the software routine that is processing all of the VNCs to let it know that the following note 
is to be the basis of a chord, and that the next number of harmonic notes will be played at the 
same as the basis note, resulting in a chord being played at once. This example shows one way 
that this can be done effectively. Other values of VNC controllers preferably can be used to 
perform similar musical functions. 

It is important to note that an additional variation can preferably be implemented that 
addresses the natural range, or Tessitura, of a particular instrument type. While the software 
algorithm preferably is taking the VNCs mentioned above and selecting real values, the real pitch 
value preferably can be compared to the real natural range of the instrument type, and the value 
of subsequent VNC outcomes preferably can be inverted accordingly. For example, if the Base 
Note of a given pattern is near the top of the range for a bass instrument Tessitura, any 
subsequent Magic Notes that end up returning a positive number can be inverted to shift the note 
to be below the preceding Base Note. This is a particular optimization that adds subtlety and 
depth to the outcome, as it preferably incorporates the natural range limitations of particular 
instrument types. 
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As a simplified example of Tessitura, Fig. 18 depicts the relative optimal ranges of 
particular instrument types. In the present context, the Tessitura of an instrument preferably is 
the range at which it sounds optimal. Certain sounds in the MIDI sound bank preferably are 
optimized for particular ranges. If you select a bass guitar sound and play very high pitched 
5 notes, the result may not be very good. For higher pitches, a guitar or violin sound may work 
better. Accordingly, when the musical rule algorithm is processing VNCs , the Tessitura of the 
selected instrument type preferably can play a role in the outcome of the real note value 
generated. If the selected instrument is approaching the top edge of its' Tessitura, and the 
musical rule routine comes across a High Note VNC, then the algorithm preferably can be 

,10 designed to bump the generated pitch down an octave or two. Similarly, other VNCs can be 

CI processed with deference to the Tessitura of the selected instrument. 

hi Fig. 19 describes another aspect of this musical process. Musical Key changes preferably 

j[ can be encoded as offsets. By this we mean that given a Key of X, the Key can be shifted up or 
i W down by inserting an offset. Such an offset preferably will transpose everything by the exact 
15 value to result in a musical phrase that is exactly as it was, but now in a different Key. Fig. 19 
p has as examples the Keys of A, C, D, and G. A Key of C preferably would have an offset of 0, A 

an offset of -3, D an offset of +2, and G an offset of +8. As will be appreciated by a student of 
O Musical Theory, the offset preferably corresponds closely with a number of half steps in an 

interval. The interval between C and G is 8 half steps. Other Keys can be similarly achieved. 
20 The use of halfsteps for encoding Keys is advantageous because, as mentioned 

previously, the MIDI language format uses whole numbers to delineate musical pitches, with 
each whole number value incrementally corresponding to a half step pitch value. Other means of 
providing an offset value to indicate Keys can be applied, but in our experience, the use of half 
steps is particularly useful in this implementation because of we are preferably using a MIDI 
25 DSP, and so the output of the Musical Rules preferably will be at least partly MIDI based. 

Fig. 20 describes another Musical Rule that preferably is part of the overall process: 
Mode application. As can be appreciated by a student of Musical Theory, assuming the mode is 
described in terms of sharps (as opposed to flats) the particular placement of sharps is a large part 
of what gives each musical phrase its own identity. In Fig. 20 we give the example of a Lydian 
30 Mode, with Ascending or Descending versions preferably available. Other well established 
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musical modes exist (Ionian, Dorian, Hypodorian, Phrygian, Hypophrygian, Hypolydian, 
Mixolydian, Aeolian, Locrian, etc.) and we only use Lydian here in the interests of space. 
Clearly, the present invention can involve other modes, with corresponding values as those in 
Fig. 20. In cases where a mode is desired that is not a conventional western mode, it is preferable 
to upgrade or alter the soundbank (e.g., located in Flash 49) so that other musical intervals are 
possible. 

Fig. 20 begins with a list of all preferably available notes in the genre of music that we are 
addressing. That is followed by the corresponding preferably natural note values that we term 
Natural Mode. The values of notes in the Natural Mode preferably correspond to the All Notes 
row of notes without the sharps (again assuming that in the present discussion we are defining 
our modes in terms of sharps, and not flats). Then the Lydian mode preferably is listed, which 
does not allow F naturals. In order to decide whether an F natural is to be raised to the next 
available pitch of F sharp, or lowered to the next available pitch of E, an algorithm preferably 
will decide between an ascending or descending transposition. Accordingly, a descendingly 
transposed F natural preferably will be changed to an E, and an ascendingly transposed F natural 
preferably will be transposed to a F sharp. Given that sharps vary from the Natural Mode, the 
use of an ascending Lydian Mode results in music that has more F sharps, and is thus more 
aggressively Lydian. This general concept is evident in other Modes as well, with ascending 
transpositions typically being more aggressive than descending transpositions. 

At this point we will go through a detailed example of the Musical Rule portion of the 
algorithm, using Fig. 21 as the example. This discussion will incorporate the earlier discussions 
of the preceding figures, to demonstrate how a preferred embodiment of the present invention 
preferably incorporates them. 

Fig. 21 depicts the data as it preferably exists between each of the numbered steps 2 - 6 in 
Fig. 16. The Musical Notation is represented to clarify the overall concept, as well as to indicate 
a simplified example of the preferable format the data can take in the software routine. 

Beginning at the top row, there is a collection of predefined VP Sub-Blocks that 
preferably can advantageously be indexed by music Style and/or length. These blocks preferably 
are of variable sizes and preferably are stored in a hexadecimal format corresponding to the 
notation of pitch (recognizing that in certain embodiments the pitch information of a VP does not 
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represent actual pitch characteristics, but VNC data as discussed above), velocity, and duration of 
a MIDI file (the preferable collection of predefined VP-Sub-Blocks is discussed in more detail 
below with reference to Figs. 22 - 23). As shown in the top row of Fig. 21, Rests preferably are 
also available in this collection of available patterns. This collection of indexed Sub-Blocks 
5 preferably is used by a software routine to construct Virtual Patterns (VPs). As mentioned earlier, 
certain alternative embodiments preferably involve using algorithmic block rules to generate the 
collection of Sub-Blocks. Such algorithmic rules preferably are configured to accept the music 
style and instrument type as inputs to then output a collection of Sub-Blocks that are appropriate 
for that style/instrument combination. Whether the Sub-Blocks are selected from predefined 

10 collection, or generated on the fly with an algorithm, they preferably are organized into a VP. 
Q VPs preferably are a collection of Sub-Blocks that have been assembled by the routine into 
|y preferably consistently-sized groupings. 

J After step 2 of Fig. 16 is applied, we preferably have a VP. The second row of Fig. 21 

FU (VP) depicts an example VP that is 2 bars long, and composed of the following sequence: Base 
15 Note, Magic Note 1, Magic Note 0, High Note, and another Base Note. Note that at this time the 

11 rhythm of the part preferably is in place, and the value of each note is conceptually the embedded 
VNC information. If the VP is played at this point, the output would likely not be pleasing. The 

C3 right column of row 2 depicts the format that this data preferably is stored in; as is discussed 
elsewhere in this disclosure, this format is remarkable similar to MIDI format data, with one 

20 exception being that the VNC information preferably is implicitly embedded in the data stream. 

The third row (NCP) depicts the same data after step 3 of Fig. 16 is applied. The VNCs 
embedded in the VP from row 2 preferably have been interpreted by the routine with the help of 
pseudo-random selections from the possible VNC values. Thus, for the first Base Note in row 2, 
we have a real note value of E in row 3, and for the Magic Note Type 1 of row 2 we have 

25 decremented the previous Base Note two half steps to a D in row 3. For the Magic Note Type 0 
we have adjusted the previous value by 0, resulting in another D. This goes on through the VP, 
and the result is clear in row 3. At this point, we preferably have the basic musical information 
that will end up in the song, except that the Chord and Mode transpositions preferably have not 
yet been made. 
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The fourth row in Fig. 21 (PwT) depicts the data stream after step 4 of Fig. 16 is applied. 
As can be seen, the NCP of row 3 has been transposed down. This is to allow the particular 
pattern being constructed to preferably conform to a particular Tonic note, thus placing it into a 
suitable chord preferably to match the other elements of the musical piece. This feature allows 
5 different portions of the melody preferably to conform to different tonic notes, thus preferably 
proceeding through a chord progression, while ensuring that all instruments preferably conform 
to the same chord progression. 

Row 5 of Fig. 21 (PwTM) takes the pattern of notes and preferably conforms it to a 
particular Mode (e.g., Ionian, Dorian, Hypodorian, Phrygian, Hypophrygian, Lydian, Hypolydian, 
10 Mixolydian, Aeolian, Locrian, etc.) preferably as well as a particular Mode type (like 
l n descending, ascending, etc.). A more complete list of musical modes and mode types has been 
O prepared by Manuel Op de Coul (available on the world wide web at: 
^! www.x s4all.nl/~huygensf/doc/modename.html) and is hereby incorporated herein by reference. 
Sj The conformation of the pattern of notes to a particular Mode preferably is done in a manner 
B consistent with Fig. 20, discussed above. In the example of Fig. 21, the resulting musical phrase 
O is very similar to that of Row 4, except the notable difference of the C sharp being reduced to a 
h C. This is because there is no such C sharp in the Lydian mode, and so it's removal is preferably 
~q required at this step. If the Modal adjustment were using the Lydian ascending mode, which is 
f|| more aggressively ascending because there are more sharps, this C sharp would have preferably 
20 'rounded up' to the next Lydian note of D. But, since in this example we are using a Lydian 
descending mode, the C sharp is preferably 'rounded-down 5 to a C. 

The final row of Fig. 21 (RP) indicates the point when the musical phrase preferably can 
be globally transposed up or down the scale. This is advantageous in the case where a global 
pitch adjustment feature is desired to preferably allow the user to quickly and easily shift the 
25 pitch of a song up or down (such as is discussed in an earlier example of the Pitch/Tempo key 
used in combination with the Up/Down keys). The example of Row 6 shows a transposition of 2 
half steps. As with all the rows of this figure, this can be seen in the musical notation, as well as 
the software notation, where the third pair of numbers can be seen to increment by a value of 
two, for each line. 
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There are instances where certain elements of the music preferably do not need the 
musical rules discussed above to be invoked. For example, drum tracks preferably do not 
typically relate to Mode or Key, and thus preferably do not need to be transposed. Additionally, 
many instrument types such as drums, and MIDI effects, preferably are not arranged in the MIDI 
sound bank in a series of pitches, but in a series of sounds that may or may not resemble each 
other. In the example of drums, the sound corresponding to C sharp may be a snare drum sound, 
and C may be a bass drum sound. This means that in certain cases, different levels of the process 
discussed above in reference to Fig. 21 preferably may be advantageously bypassed in these 
cases. 

The collection of sub-blocks discussed above, from which VPs preferably are constructed, 
can be better understood in light of Figs. 22 and 23. 

Fig. 22 depicts an example of the rhythmic variations that preferably are possible, based 
on example durations of 1 or 2 quarter notes. The first row indicates the 4 possible variations, 
given a few basic conditions: that the eighth note is the smallest unit, the length is 1 quarter note, 
and that all full rests are indicated separately as 'empty'. The second row in Fig. 22 lists the 
possible variations, given similar variations: that the eighth note is the smallest unit, that any 
variations in the first row are not included, and that the length is 2 quarter notes. 

One way to create a set of rhythmic variations such as those in Fig. 22 preferably is to put 
the variation data into MIDI event format. This approach preferably involves using a MIDI 
sequencer software tool (such as Sonar from Cakewalk, and Cubase from Steinberg) to generate 
the rhythmic blocks. This preferably allows the use of a variety of input methods (e.g., a 
keyboard controller, a MIDI wind controller, a MIDI guitar controller, etc.), and further 
preferably allows the intuitive copying, pasting, quantizing, and global characteristic adjustments 
(e.g., selecting multiple events and adjusting the pitch for all). Then, the MIDI events preferably 
can be exported as a MIDI file (possibly 1 file for each instrument group). Finally, a software 
batch file program preferably can be written to open the MIDI file and parse out the substantial 
header information, as well as any unneeded characteristic information (such as controller or 
patch information), and preferably output the optimized data into a file that is suitable to include 
in the source code (e.g., ASCII text tables). The use of the sequencing tool preferably enables 
one to quickly generate a variety of appropriate rhythmic blocks for a given instrument type, 
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since the vast array of MIDI controller devices are available that can mimic the characteristics of 
a particular instrument type. For example, one can use a MIDI guitar controller to strum in 
patterns for a guitar type of instrument group. 

The example of Fig. 22 is simplified to convey a concept; that all rhythmic variations 
5 covering up to two quarter notes (given the conditions discussed above) preferably can be 

organized very efficiently according to rhythmic density. Fig. 22 teaches an advantageous way to 
efficiently organize the set of blocks used to construct a VP shown in Fig. 15. If the example of 
Fig. 22 were expanded to include additional rows for rhythmic blocks with longer durations, 
given conditions such as those described above that are consistent across the rows, then each 
JO subsequent row would have patterns of less density than those above it. This is because of the 

$3 condition that each row does not include any of the variations present in rows above it, and 

ill 

y because the duration of the pattern increases for each subsequent row. Thus, there is a direct 
relationship between the example shown in Fig. 22 and the relative rhythmic density of patterns 
used to make a VP. 

£5 Clearly, if any of the conditions described in Fig. 22 were changed, e.g., if a sixteenth 

if""! 

IZ note were the smallest unit or full rests were indicated with a pattern containing a rest, then 

4j preferably the number of variations would be different. While the number would be different, 

O the desirable effects of organizing patterns based on this concept of rhythmic density would 

ru 

remain. 

20 In addition to efficiency, such an approach to organizing the available rhythmic blocks 

preferably enables the use of rhythmic density as an input to a software (e.g., algorithmic 
function) or hardware (e.g., state table gate array) routine. Thus, one preferably can associate a 
relative rhythmic density with a particular instrument type and use that rhythmic density, possibly 
in the form of a desired block length, preferably to obtain a corresponding rhythmic block. This 

25 preferably can be repeated until a VP is complete (see Fig. 15). The VP preferably can thereby 
be constructed with a desired relative rhythmic density. This is particularly useful because it 
preferably allows the creation of VPs with almost limitless variations that have rhythmic 
characteristics preferably generally corresponding to a given instrument type. 

As will be apparent to one of ordinary skill in the art of MIDI, given the context of VP 

30 generation discussed herein, the rhythmic variations shown in Fig. 22 can be represented in the 
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form of MIDI events. In this case, many of the available characteristics in the MIDI events, such 
as pitch, velocity, aftertouch, etc., preferably might be generically set. Then, additional functions 
for such characteristics preferably can be applied to the MIDI events during the creation of VPs 
to impart additional subtlety to the finished music. Such functions preferably can be fairly 
simple and still be effective. As one example, for a given Style of music (e.g., rock), the velocity 
of any MIDI events in the VP that fall on a particular location in the measure (e.g., the downbeat) 
can be modestly increased. Similarly, in a music Style that generally has a rhythmic swing feel, 
where one or more of the beats in a measure may be slightly retarded or advances, the 
corresponding MIDI events in a VP preferably can be modified so as to slightly adjust the timing 
information. Clearly, these types of simple functions preferably can be selectively applied to 
either a given instrument type, and/or a given musical Style. 

Similar to the concept of using relative rhythmic density as a deterministic characteristic 
in creating algorithmic music, Fig. 23 describes a concept of relative mobility of note pitch. As 
shown in Fig. 23, the vertical axis indicates pitch change, and the horizontal axis indicates time. 
Two example types of melody streams are depicted; the top having a fluid movement through a 
variety of pitches, and the bottom having rather abrupt, discrete changes among a fewer number 
of pitches. Thus, the melody on the top of Fig. 23 has a higher relative mobility of note pitch. As 
can be appreciated by the previous discussion of VNCs, the melody example on the top 
preferably would generally require more Magic Notes to imitate, and the melody example on the 
bottom preferably would generally require more Base Notes and High Notes to imitate. 

This concept preferably applies to most instrument types in a given musical Style as well, 
in that certain instruments have a higher relative mobility of note pitch than others. As an 
example, a bass guitar in a rock Style can be thought of as having a lower relative mobility of 
note pitch compared to a guitar in the same Style. The relationship between relative mobility of 
note pitch and relevant VNC type can be very helpful in creating the collection of predefined 
sub-blocks discussed above, in that it serves as a guide in the determination of actual VNC for 
each rhythmic pattern. When one wants to create a set of rhythmic building blocks for use in a 
particular musical Style and/or instrument type, it is advantageous to consider/determine the 
desired relative mobility of note pitch, and allocate VNC types accordingly. 
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As an additional variation, and in keeping with the discussion above regarding relative 
rhythmic density, an architecture that constructs a VP for a given instrument type and/or musical 
Style preferably can greatly benefit from a software (e.g., algorithmic function) or hardware (e.g., 
state table gate array) routine relating to relative mobility of note pitch. As an example, a 
particular music Style and/or instrument type can be assigned a relative rhythmic density value, 
and such a value can be used to influence the allocation or distribution of VNC types during the 
generation of a VP. 

The use of relative rhythmic density and relative mobility of note pitch in the present 
context preferably provides a way to generate VPs that closely mimic the aesthetic subtleties of 
'real' human-generated music. This is because it is a way of preferably quantifying certain 
aspects of the musical components of such 'real' music so that it preferably can be mimicked 
with a computer system, as disclosed herein. Another variation and benefit of such an approach 
is that these characteristics preferably are easily quantified as parameters that can be changeable 
by the user. Thus a given musical Style, and/or a given instrument type, preferably can have a 
relative mobility of note pitch parameter (and/or a relative rhythmic density parameter) as a 
changeable characteristic. Accordingly, the user preferably could adjust such a parameter during 
the song playback/generation and have another level of control over the musical outcome. 

Various examples of preferred embodiments for the block creation aspects of the present 
invention will now be described. 

Continuing the example presented in Fig. 15, wherein a RP preferably is 2 bars, and a VP 
preferably is comprised of 8 quarter notes (QN), the pattern structure creation example of Fig. 24 
assumes that the particular song generation implementation preferably involves a VP length of 8 
QN, a 2 bar RP, and variably-sized Blocks. While those skilled in the art will appreciate the 
considerable number of advantages arising from the architecture of this preferred embodiment, 
they will additionally appreciate that various adaptations and modifications to these embodiments 
can be configured without departing from the spirit and scope of the invention. 

As shown in Fig. 24, one preferred embodiment of the present invention involves the 
creation of a pattern structure. This pattern structure preferably is comprised of the information 
needed to select the actual Blocks, which in many ways are the fundamental unit of the song 
generation. This example of pattern structure creation involves determining each Block's 
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duration (in a given VP), as well as the group of instruments from which the Block will be 
selected. Following this step, and discussed below, this information preferably is used to directly 
generate the Blocks themselves. 

Pattjnfo is a routine that preferably can be used to generate the pattern structure 
information as part of the creation of a particular VP from Blocks. 

Shift is a multiplier that preferably can be used in a variety of ways to add variation to the 
composed VP; for example, it could be a binary state that allows different Block variations based 
on which of the 2 bars in the RP that a particular Block is in. Other uses of a Shift multiplier can 
easily be applied that would provide similar variety to the overall song structure. 

NumJTypes is the number of instruments, and Num_Sub_Drums is the number of 
individual drums that make up the drum instrument. This latter point is a preferable variation 
that allows an enhanced layer of instrument selection, and it can be applied to other contexts 
other than the drum instrument. Conversely, this variation is not at all necessary to the present 
invention, or even the present embodiment. 

Blockjnd is the Block index, FX_No is for any effects number information. Combi_No 
is an index that preferably points to a location in a table called Comb_Index_List. This table 
preferably is the size of the number of Styles multiplied by the number of instrument types; each 
entry preferably contains: SubStyle_Mask to determine if the particular entry is suitable for the 
present SubStyle, Combijndex to determine the Block length, and Groupjndex to determine 
the group of individual MIDI patches (and related information) from which to determine the 
Block. 

Combijndex preferably points to a table called Style_Type_Combi that preferably 
contains multiple sets of Block sizes. Each Block_Size preferably is a set of Block sizes that add 
up to the length of the SEQ. An example SEQ length is 8 QN. 

Group Jndex preferably points to a table called Style_Group that preferably contains sets 
of MIDI-type information for each group of Styles, preferably organized by MIDI Bank. PC 
refers to Patch Change MIDI information, P refers to variably sized MIDI parameters for a given 
Patch, and GS stands for Group Size. GS for group 1 preferably would indicate how many 
instruments are defined for group 1. 
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One preferable optimization of the execution of this step is to incorporate a pseudo- 
random number generator (PRNG) that preferably will select a particular patch configuration 
from the group identified by GS. Then, as the user elects to change the instrument within a 
particular SubStyle, and within a particular lane, another set of patch information preferably is 
selected from the group identified by GS. This use of a PRNG preferably can also be 
incorporated in the auto-generation of a song, where, at different times, the instrument preferably 
can be changed to provide variation or other characteristics to a given song, Part, SubPart, SEQ, 
RP, VP, etc. There are other areas in this routine process that preferably could benefit from the 
use of a PRNG function, as will be obvious to one of ordinary skill in the art. 

Once the Block duration and instrument patch information preferably are determined for a 
given VP, the virtual Block information preferably can be determined on a Block-by-Block basis, 
as shown in Fig. 25. 

Block JList preferably is a routine that can determine a virtual Block using the Block size, 
and the instrument type. As shown in Fig. 25, Style preferably is a pointer to a table of 
Virtual_Block_Data pointers that preferably are organized by Width (i.e., 1-8 QN) and Group 
(i.e., instrument group). Once the Start_Pointer is determined, the Block data preferably can be 
obtained from a Virtual_Block_Data table. Special cases exist where the Block data may be 
already known; for example, empty Blocks, repeating Blocks, etc. 

Again, as discussed above in connection with the pattern structure generation, the present 
steps of the overall process preferably can use an optional PRNG routine to provide additional 
variety to the Block. Another fairly straightforward extension of this example is to use 'stuffing' 
(i.e.; duplicate entries in a particular table) preferably to provide a simple means of weighting the 
result. By this we refer to the ability to influence the particular Block data that is selected from 
the Virtual JBlockJData table preferably by inserting various duplicate entries. This concept of 
stuffing can easily be applied to other tables discussed elsewhere in this specification, and other 
means of weighting the results for each table lookup that are commonly known in the art can be 
easily applied here without departing from the spirit and scope of the invention. 

Additionally, as one of ordinary skill in the art will appreciate, though these examples of 
preferred embodiments to the various inventive steps involve substantial reliance on tables, it 
would be fairly easy to apply concepts of state machines, commonly known in the art, to these 
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steps and optimize the table architecture into one that incorporates state machines. Such an 
optimization would not depart from the spirit and scope of the present invention. 

Various examples of preferred embodiments for pseudo-random number generation 
aspects of the present invention will now be described. 
5 Some of the embodiments discussed in the present disclosure preferably involve 

maximizing the limited resources of a small, portable architecture, preferably to obtain a complex 
music generation/interaction device. When possible, in such embodiments (and others), 
preferably it is desirable to minimize the number of separate PRNG routines. Although an 
application like music generation/interaction preferably relies heavily on PRNG techniques to 
10 obtain a sense of realism paralleling that of similarly Styled, human-composed music, it is 

iisa 

jh tremendously desirable to minimize the code overhead in the end product so as to allow the 

technology preferably to be portable, and to minimize the costs associated with the design and 
%Q manufacture. Consequently, we have competing goals of minimal PRNG code/routines, and 
f || maximal random influence on part generation. 

W hi addition, another goal of the present technology is preferably to allow a user to save a 

d song in an efficient way. Rather than storing a song as an audio stream (i.e.; MP3, WMA, WAV, 
O etc.), it is highly desirable to save the configuration information that was used to generate the 
2\ song, so that it preferably can be re-generated in a manner flawlessly consistent with the original. 
fW The desirability of this goal can easily be understood, as a 5 minute MP3 file is approximately 
20 5MB, and the corresponding file size for an identical song, preferably using the present 
architecture, is approximately 0.5KB, thus preferably reduced by a factor of approximately 
10,000. In certain preferred embodiments, the sound quality of a saved song is similar to a 
conventional compact disc (thereby demonstrably better than MP3). In this comparison, a 5 
minute song stored on a compact disc might be approximately 50MB; thus the file size of a song 
25 using the present invention is reduced from a compact disc file by a factor of approximately 
100,000. 

Saving the configuration information itself, rather than an audio stream, preferably allows 
the user to pick up where they left off, in that they can load a previously saved piece of music, 
and continue working with it. Such an advantage is not easily possible with a single, combined 
30 audio stream, and to divide the audio into multiple streams would exponentially increase the file 
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size, and would not be realizable in the current architecture without significant trade-offs in 
portability and/or quality. 

Additionally this aspect of the present invention preferably enables the user to save an 
entire song from any point in the song. The user preferably can decide to save the song at the end 
of the song, after experiencing and interacting with the music creation. Such a feature is clearly 
advantageous as it affords greater flexibility and simplicity to the user in the music creation 
process. 

Turning now to Fig. 26, we have a diagram representing the preferable algorithmic 
context for some examples of Pseudo-Random Number Generation (PRNG). Drum Seed (DS) is 
a number that preferably is used as input to a simple PRNG routine to generate DS0-DS4. As 
would be apparent to one of ordinary skill in this art, the number of outputs preferably can be 
varied; we use 4 here for illustrative purposes. The 4 values that are output from the PRNG 
preferably are fed into various parts of the Drum Part Generation Algorithm to provide some 
pseudo-random variation to the drum part. 

It is important to note that if the same seed input to the simple PRNG routine is used a 
plurality of times, the same list of values preferably will be output each time. This is because 
simple PRNG routines are not random at all, as they are a part of a computing system that is, by 
its very nature, extremely repeatable and predictable. Even if one adds some levels of 
complexity to a PRNG algorithm that take advantage of seemingly unrelated things like clocks, 
etc., the end user can discern some level of predictability to the operation of the music 
generation. As can be imagined, this is highly undesirable, as one of the main aspects of the 
device is to generate large quantities of good music. 

One benefit of the preferably predictable nature of simple PRNGs is that, by saving the 
seed values, one preferably can generate identical results later using the same algorithm. Given 
the same algorithm (or a compatible one, preferably), the seeds preferably can be provided as 
inputs and preferably achieve the exact same results every time. Further discussion of the use of 
seeds in the music generation/interaction process is discussed elsewhere in this specification. 

While it is a feature of the present invention to preferably incorporate PRNG that are 
repeatable, there are also aspects of the present invention that preferably benefit from a more 
'truly-random' number generation algorithm. For purposes of clarity, we call this 'complex 
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PRNG\ Using the example of Fig. 26 and 27, if, on a regular basis, the same seed input were 
used for both the Drum part and the Bass part, it might limit the variability of the outcome. 
Another example is that, although preferably when playing a previously saved song, you want A 
and A' to always be the same, when you are generating a new song, it preferably is highly 
desirable that these seed inputs be randomly different. Otherwise the song generation suffers 
from the same repeatability as the song playback. 

One example of a complex PRNG that works within the cost/resource constraints we have 
set, is one preferably with an algorithm that incorporates the timing of an individual user's 
button-presses. For example, from time to time in the process of generating music and providing 
user interaction in that generative process, we preferably can initialize a simple timer, and wait 
for a user button press. Then the value of that timer preferably can be incorporated into the 
PRNG routine to add randomness. By way of example, one can see that, if the system is running 
at or around 33 MHz, the number of clocks between any given point and a user's button press is 
going to impart randomness to the PRNG. Another example is one preferably with an algorithm 
that keeps track of the elapsed time for the main software loop to complete; such a loop will take 
different amounts of time to complete virtually every time it completes one loop because it varies 
based on external events such as user button presses, music composition variations, each of 
which may call other routines and/or timing loops or the like for various events or actions, etc. 
While it preferably is not desirable to use such a complex PRNG in the generation of values from 
seeds, due to repeatability issues discussed above, it preferably can be desirable to use such a 
PRNG in the creation of seeds, etc., as discussed above. As an additional example, such a 
complex PRNG routine can be used to time interval, from the moment the unit is powered up, to 
the moment the c press-it-and-forget-if mode is invoked; providing a degree of randomness and 
variability to the selection of the first auto-play song in Home mode (discussed earlier in this 
disclosure). Of course, this type of complex PRNG preferably is a variation of the present 
invention, and is not required to practice the invention. 

One desirable aspect of the present invention involves the limiting of choices to the end 
user. The various ways instruments can be played are limitless, and in the absence of a structure, 
many of the possible ways can be unpleasant to the ear. One feature of palatable music is that it 
conforms to some sort of structure. In fact, it can be argued that the definition of creativity is 
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expression through structure. Different types of music and/or instruments can have differing 
structures, but the structure itself is vital to the appeal of the music, as it provides a framework 
for the listener to interpret the music. The present invention involves several preferable aspects 
of using seed values in the generation of a piece of music. One preferable way to incorporate 
5 seeds is to use two categories of seeds in a song: 1) seeds determining/effecting the higher-level 
song structure, and 2) seeds determining/effecting the particular instrument parts and 
characteristics. Preferably, the first category of seeds is not user-changeable, but is 
determined/effected by the Style/SubStyle and Instrument Type selections. Preferably, the 
second category of seeds is user-changeable, and relates to specific patterns, melodies, effects, 
r iP etc. The point in this example is that there are some aspects of the music generation that are 

preferably best kept away from the user. This variation allows the user to have direct access to a 
ill subset of the seeds that are used for the music generation, and can be thought to provide a 
J structure for the user to express through. This preferable implementation of the present 
^ discussion of seeds enables a non-musically-trained end user to creatively make music that 
i5 sounds pleasurable. 

U Various examples of preferred embodiments for a simple data structure (SDS) to store a 

% ;£ song of the present invention will now be described. 

The use of PRNG seeds preferably enables a simple and extremely efficient way to store a 
song. In one embodiment of the present invention, the song preferably is stored using the 

20 original set of seeds along with a small set of parameters. The small set of parameters preferably 
is for storing real time events and extraneous information external to the musical rules algorithms 
discussed above. PRNG seed values preferably are used as initial inputs for the musical rules 
algorithms, preferably in a manner consistent with the PRNG discussion above. 
Fig. 28 lists some examples of the types of information in an SDS: 

25 'Application Number' is preferably used to store the firmware/application version used to 

generate the data structure. This is particularly helpful in cases where the firmware is 
upgradeable, and the SDS may be shared to multiple users. Keeping track of the version of 
software used to create the SDS is preferable when building in compatibility across multiple 
generation/variations of software/firmware. 
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'Style/SubStyle' preferably is used to indicate the SubStyle of music. This is helpful 
when initializing various variables and routines, to preferably alert the system that the rules 
associated with a particular SubStyle will govern the song generation process. 

'Sound Bank/Synth Type' preferably indicates the particular sound(s) that will be used in 
5 the song. This preferably can be a way to preload the sound settings for the Midi DSP. 

'Sample Frequency' preferably is a setting that can be used to indicate how often samples 
will be played. Alternatively, this preferably can indicate the rate at which the sample is 
decoded; a technique useful for adjusting the frequency of sample playback. 

'Sample set' preferably is for listing all the samples that are associated with the Style of 
| 0 music. Although these samples preferably may not all be used in the saved SDS version of the 

song, this list preferably allows a user to further select and play relevant samples during song 
|y playback. 

jl 'Key' preferably is used to indicate the first key used in the song. Preferably, one way to 

I y indicate this is with a pitch offset. 

J5 'Tempo' preferably is used to indicate the start tempo of the song. Preferably, one way to 

indicate this is with pulses per quarter note (PPQN) information. 
l ;i 'Instrument' preferably is data that identifies a particular instrument in a group of 

Q instruments. Such as an acoustic nylon string guitar among a group of all guitar sounds. This 

data is preferably indexed by instrument type. 
20 'State' preferably is data that indicates the state of a particular instrument. Examples of 

states are: muted, un-muted, normal, Forced play, solo, etc. 

'Parameter' preferably is data that indicates values for various instrument parameters, 

such as volume, pan, timbre, etc. 

'PRNG Seed Values' preferably is a series of numerical values that are used to initialize 
25 the pseudo-random number generation (PRNG) routines. These values preferably represent a 

particularly efficient method for storing the song by taking advantage of the inherently 

predictable nature of PRNG to enable the recreation of the entire song. This aspect of the present 

invention is discussed in greater detail previously with respect to Figs. 26 and 27. 

Through the use of these example parameters in a SDS, a user song preferably can be 
30 efficiently stored and shared. Though the specific parameter types preferably can be varied, the 
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use of such parameters, as well as the PRNG Seeds discussed elsewhere in this disclosure, 
preferably enables all the details necessary to accurately repeat a song from scratch. It is 
expected that the use of this type of arrangement will be advantageous in a variety of fields where 
music can be faithfully reproduced with a very efficient data structure. 

Fig. 29 depicts a logical flow chart for a preferable general architecture that could be used 
in combination with the SDS to practice the present invention. This flow chart describes the big 
picture for a preferable software/firmware implementation, and describes in more detail how the 
song preferably is efficiently and interactively generated using seed values. 

At the start of Fig. 29, an initial set of seed values preferably is either loaded from a data 
file (e.g., SDS) or determined anew (e.g., using the Complex PRNG approach discussed 
elsewhere in this disclosure). While this set of values preferably can effectively be 
determined/loaded for the entire song at this point, it maybe considered advantageous to only 
determine/load them in sections as needed, preferably to provide a degree of randomness to a 
freshly generated song. Further, as discussed above, the seed values may preferably be arranged 
in two categories, one user-changeable, and the other not. Once at least some seed values 
preferably are determined/loaded, the music for a given song part preferably begins to be 
generated, and the user interface (e.g., display, video output, force-feedback, etc.) preferably can 
be updated accordingly. At any point in this process, if a user input is detected (other than a 
'save' command), such as a change of instrument or effect, the relevant seeds for the part of the 
song currently being changed by the user preferably are updated and the generation of the music 
for the given part preferably continues. If a user input 'save' command is detected, all seeds (not 
just the relevant seeds for the given song part) preferably can be saved to a non-temporary 
storage location, such as Flash memory, a hard drive, or some other writeable memory storage 
location that affords some degree of permanence. This arrangement is desirable because it 
preferably allows a user to listen to most of a song before electing to save it in its entirety. As 
long as there is no user input, the generation of music for a given song part preferably continues 
until the end of song part is detected, at which time the flow preferably proceeds to the next song 
part. At this time, if necessary, the relevant seeds for the next song part preferably are 
determined/loaded. Eventually, when an end-of-song condition preferably is detected, the song 
ends. 
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Various examples of preferred embodiments for a complex data structure to store a song 
of the present invention will now be described. 

In another variation to the present invention, it is contemplated that, for purposes of 
saving and playing back songs, the reliance on seeds as inputs to the musical rule algorithms (see 
SDS discussion above) preferably may be exchanged for the use of Complex Data Structures 
(CDS). In part because of it's efficiency, the seed-based architecture discussed above is desirable 
when forward/backward compatibility is not an issue. However, it has some aspects that may not 
be desirable, if compatibility across platforms and/or firmware revisions is desired. In these 
cases, the use of an alternative embodiment may be desirable. 

As described above, a seed preferably is input to a simple PRNG and a series of values 
preferably are generated that are used in the song creation algorithm. For purposes of song save 
and playback, the repeatability preferably is vital. However, if the algorithm is modified in a 
subsequent version of firmware, or if other algorithms would benefit from the use of the simple 
PRNG, while it is in the middle of computing a series (e.g.; DS0-DS3 in Fig. 26), or if additional 
elements are needed for subsequent music Styles, etc., that involve additional seeds, it is possible 
that the repeatability and backwards-compatibility may be adversely impacted. This means that 
in certain applications of the present invention, preferably in order to allow future upgrades to 
have significant leeway, and in order to maintain backwards-compatibility with songs saved 
before the upgrade, another preferably more complex data structure for saving the song is 
desirable. 

Fig. 30 describes some example parameters to include in such a CDS. In general, the 
difference between this structure and the SDS example described in Fig. 28 is that this preferably 
does not rely on seed values to recreate the song. Instead, this CDS preferably captures more of 
the actual data in the song, resulting in a file size that is larger than the SDS example. The use of 
CDS preferably is still a tremendously more efficient and desirable means of saving a song 
compared to an audio stream, as mentioned above in connection with the seed method. While the 
seed method preferably gives you a size reduction over a typical MP3 audio stream of 10,000, the 
CDS method preferably might give an approximate size reduction of 1,000; for a WAV audio of 
100,000, the size reduction results in 10,000 (or when compared to a compact disc the size 
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reduction is approximately 100,000). While much larger than the seed approach, the CDS 
approach is still advantageous over the audio stream methods of music storage in the prior art. 

While both examples have their advantages, it may also be advantageous to combine 
aspects of each into a hybrid data structure (HDDS). For example, the use of some seed values in 
5 the data structure, while also incorporating many of the more complex parameters for the CDS 
example, preferably can provide an appropriate balance between compatibility and efficiency. 
Depending on the application and context, the balance between these two goals preferably can be 
adjusted by using a hybrid data structure that is in between the SDS of Fig. 28 and the CDS of 
Fig. 30. 

10 In the example of Fig. 30, 'Application Number', ' Style/Sub Style', 'Sound Bank/Synth 

g Type', 'Sample Frequency', 'Sample List', 'Key', 'Tempo', 'Instrument', 'State', and 

|H 'Parameter' are preferable parameters that are described above in reference to Fig. 28. 

tD 'Song Structure' preferably is data that preferably lists the number of instrument types in 

rl j the song, as well as the number and sequence of the parts in the song. 

\i 'Structure' preferably is data that is indexed by part that preferably can include the 

O number and sequence of the sub-parts within that part. 

q 'Filtered Track' preferably is a parameter that preferably can be used to hold data 

:£j describing the characteristics of an effect. For example, it preferably can indicate a modulation 
F U type of effect with a square wave and a particular initial value. As the effect preferably is 
20 typically connected with a particular part, this parameter may preferably be indexed by part. 

'Progression' preferably is characteristic information for each sub-part. This might 

include a time signature, number and sequence of SEQs, list of instrument types that may be 

masked, etc. 

'Chord' preferably contains data corresponding to musical changes during a sub-part. 
25 Chord vector (e.g., +2,-1, etc.), key note (e.g., F), and progression mode (e.g., dorian ascending) 
data preferably are stored along with a time stamp. 

'Pattern' and the sub-parameters 'Combination', 'FX Pattern', and 'Blocks', all 
preferably contain the actual block data and effects information for each of the instruments that 
are used in the song. This data is preferably indexed by the type of instrument. 
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'Nota Bene' preferably is for specifying instruments or magic notes that will be played 
differently each time the song is played. This parameter preferably allows the creation of songs 
that have elements of improvisation in them. 

Additional parameters can preferably be included, for example to enable soundbank data 
associated with a particular song to be embedded. Following this example, when such a CDS is 
accessed, the sound bank data preferably is loaded into non- volatile memory accessible to a DSP 
such that the sound bank data may be used during the generation of music output. 

Fig. 3 1 depicts a preferable example flow chart for the CDS approach discussed above. It 
is similar to Fig. 29, except that at the points in the flow where the Seeds are loaded, determined, 
updated, and/or stored, there are corresponding references to loading, determining, updating, 
and/or storing CDS parameter data corresponding to Song Structure, Structure, Filtered Track, 
Progression, Chord, Pattern, Instrument, State, Parameter, and Nota Bene. 

In certain preferred embodiments the Player 10 is accompanied by a companion PC 
software system designed to execute on a PC system and communicate with Player 10 via a data 
link (e.g., USB 54, Serial I/O 57, and/or a wireless link such as 802.1 lb, Bluetooth, IRDA, etc.). 
Such a PC software system preferably is configured to provide the user with a simple and 
effective way to copy files between the Player 10 and other locations (e.g., the PC hard drive, the 
Internet, other devices, etc.). For example, the companion PC software program preferably 
operates under the MS Windows family of Operating Systems and provides full access to the 
User for all PlayerlO functions and Modes, as well as the local Player memory (e.g., SMC). 
Following this example, a user can connect to the Internet and upload or download music related 
files suitable to be used with the Player 10 (e.g., MIDI, WMA, MP3, Karaoke, CDS, SDS, etc.) 
as well as user interface-related files such as customized user-selectable graphics preferably to be 
associated with music styles or songs on the Player 10. Such a companion PC program 
preferably is also used to enable hardware and/or software housekeeping features to be easily 
managed, such as firmware and sound bank updates. This companion PC software system 
preferably is used to provide the user with an easy way to share music components and/or 
complete songs with other users in the world (e.g., via FTP access, as attachments to email, via 
peer-to-peer networking software such as Napster, etc.). It is important to note the potentially 



61 



royalty-free nature and extreme size efficiency of musical output from the Player 10 lends itself 
well to the Internet context of open source file sharing. 

Various examples of preferred embodiments for hardware implementation examples of 
the present invention will now be described. 
5 Fig. 32 is a block diagram of one portable hardware device embodiment 35 of the present 

invention. The microprocessor (MP 36) controls local address and data busses (MP Add 37 and 
MP Data 38); the universal serial bus interface (USB 39), the smart media card interface (SMC 
40) (as discussed previously, alternatives to SmartMedia, such as other types of Flash or other 
memory cards or other storage media such as hard disk drives or the like may be used in 
10 accordance with the present invention), and a memory such as Flash 41 are preferably on the MP 
g data bus 38; and the MIDI/Audio DSP (DSP 42) is preferably on both the MP address bus 37 and 

MP data bus 38. The SMC interface 40 preferably has a buffer 59 between it and the MP Data 
%y bus 38, and there preferably are keyboard interface 42 (with MP Data Latch 44) and LCD 
III interface 45 associated with the MP busses as well. In this example, the MP 36 can preferably 
y perform as a sequencer to extract timing information from an input data stream and send MIDI 
information (possibly including NRPN-type data discussed elsewhere in this disclosure) to the 
O DSP 42 - The DSP 42 additionally preferably has dedicated address and data busses (DSP Add 
JI] 46 and DSP Data 47) that preferably provide access to local RAM 48 and Flash 49 memories. 
ill The MP 36, DSP 42, FM receiver 50, and Microphone input 5 1 all preferably have some 

20 type of input to the hardware CODEC 52 associated with the DSP 42. 

The connector 53 at the top left of Fig. 32 can be considered as a docking station interface 
or as a pure USB interface or external power interface, preferably complete with interfaces for 
USB 54, power 55, rechargeable battery charge 56, serial I/O 57, and Audio I/O 58. An example 
of a block diagram for a docking station device 70 of the present invention is provided in Fig. 34. 
25 As is shown in Fig. 34, the docking station 70 preferably includes a local microprocessor (LMP 
71), preferably with a USB interface 72, address and data busses (LMP ADD 73 and LMP Data 
74), a MIDI I/O interface 75, and memory such as Flash 76. Additionally, the docking station 
device 70 preferably contains an Audio Codec 77, a Video I/O interface 78, and a Power Supply 
79. 
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The MP 36 in this example is preferably the ARM AT91R40807, though any similar 
microprocessor could be utilized (such as versions that have on-board Flash, more RAM, faster 
clock, lower voltage/lower power consumption, etc.). This ARM core has 2 sets of instructions: 
32bit and 16bit. Having multiple width instructions is desirable in the given type of application 
in that the 16bit work well with embedded systems (Flash, USB, SMC, etc.), and 32bit 
instructions work efficiently in situations where large streams of data are being passed around, 
etc. Other variations of instruction bit length could easily be applied under the present invention. 

For 32bit instructions, the system of the present invention preferably pre-loads certain 
instructions from the Flash memory 41 into the internal RAM of the MP 36. This is because the 
Flash interface is 16bit, so to execute a 32bit instruction takes at least 2 cycles. Also, the Flash 
memory 41 typically has a delay associated with read operations. In one example, the delay is 
approximately 90ns. This delay translates into the requirement for a number of inserted wait 
states (e.g., 2) in a typical read operation. Conversely, the internal RAM of the MP 36 has much 
less delay associated with a read operation, and so there are less wait states (e.g., 0). Of course, 
the internal RAM in this case is 32bits wide, and so the efficiencies of a 32bit instruction can be 
realized. 

As is shown above in the example regarding the wait states of Flash memory 41, there are 
many reasons why it is desirable to try to maximize the use of the internal MP RAM. As can be 
seen from Fig. 32, this example of the present invention preferably does not include an SDRAM 
or RDRAM. While these types of memory means are available to include in such a system, and 
such use would not depart from the spirit and scope of the present invention, in certain portable 
applications, such as depicted in Fig. 32, the use of relatively unnecessary complexity (e.g., 
SDRAM controllers & address logic, etc.) is not preferable. The current example of Fig. 32 
achieves many of the benefits of the present invention, in a simple design suitable for a portable 
device. 

One example of a trade-off associated with complexity and portability is the use of a 
widely available WMA audio decoder algorithm from Microsoft. In this example, when 
operating the ARM MP of Fig. 32 at 32MHz/3.0V, Microsoft's WMA decoding algorithms can 
be incorporated to successfully decode and play a WMA-encoded song in stereo at 44KHz and at 
a sample rate of 128Kbps. However, as discussed elsewhere in this specification, a preferable 
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feature that allows the speed of an audio stream song to be adjusted can also be incorporated. In 
this case, when speeding up the WMA 44KHz song using the speed control, it is possible that the 
system of Fig. 32 may encounter an underrun condition. In this specific example, such cases do 
not occur when the ARM MP 36 is operated at 40MHz/3.0V. However, when operating the MP 
5 36 at 3.0V, a significant performance hit on battery life can occur. So, because the use of the 
WMA at 44KHz in combination with the pitch speed feature seems to be relatively unnecessary, 
this particular example feature can preferably be sacrificed for the benefit of a longer battery life. 
Obviously, one could incorporate variations such as: a better battery system, a speed stepped 
approach that operates at full speed when plugged in and at a slower speed when using batteries, 
10 a more efficient WMA algorithm, etc. However, this example illustrates the point that 
q competing needs can preferably be balanced with performance and portability, 
^ In the example of Fig. 32, the MP 36 contains 136KB of internal RAM. The 

i[| performance/portability balance described above dictates that one preferably must play certain 
it J tricks on the system to maximize the efficiency of the 136Kb RAM. For example, the memory 
p range can preferably be divided into different regions for buffering, programs, etc., and in real- 
time modes (e.g., WMA playback), the percentage used for the code can preferably be 
p maximized and the percentage used for buffers preferably minimized, 
p. Another alternative embodiment can be an MP 36 with preferably more internal RAM 

I U (for example, 5 1 2KB) which would preferably allow a reduction or elimination of the use of 
20 Flash memory 41 . Such a system may add to the total cost, but would reduce the complexities 
associated with using Flash memory 41 discussed above. 

Another variation is the example shown in Fig. 33, which describes the local DSP area of 
Fig. 32 wherein preferably additional RAM 90 is accessible on the DSP bus. Such additional 
RAM can be preferably used to temporarily store large MIDI sound loops that can be played 
25 quickly and often. RAM 90 can also preferably be used to temporarily store one or more sound 
streams (e.g., PCM) that can thus be preloaded and played quickly. Without this feature, each 
sample might need to be managed and sent by the MP to the DSP every time it is used, in real 
time. While this is not a problem in certain implementations of the present invention, it may be 
advantageous to use such additional RAM 90 as shown in Fig. 33 when extensive usage of sound 
30 streams is desired. In such cases, a typical size of the RAM 90 in Fig. 33 might preferably be 
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5 12KB, and the MP will preferably only need to send an instruction to the DSP to play the 
locally stored stream. 

Continuing the discussion of the architecture shown in Fig. 32, Fig. 35 describes one 
example for an address map for the internal RAM of the MP. Starting from the bottom of the 
map, the bottom two sections represent the libraries and routines that are often used, and are 
always loaded in RAM. The midsection labeled "multi-use" is preferably used for WMA/MP3 
related code during the playback of WMA, MP3, and/or other similarly encoded audio stream 
songs from the SMC. However, during other modes, such as eDJ mode, this midsection is 
preferably used for Block, Song, and SMC buffers. The next section above this area is preferably 
used as a buffer for streaming media. This section is preferably divided into a number of 
subsections, and each subsection is preferably sent to the DSP device at regular intervals (e.g., 
5.8ms @44.1kHz, 16bit, 1Kb blocks). Above this, at the top of Fig. 35, is the general-purpose 
area of MP RAM preferably used for variables and general buffers. 

In this example, when the Player is not operating in a WMA/MP3/etc. mode, the 'multi- 
use' mid section can preferably be used for at least three types of buffers. Block buffers are 
preferably used by the eDJ Block creation algorithms (e.g., Figs. 24 and 25) to store Block data 
during operation. Song buffers are preferably used by the eDJ algorithms to store Song data (see 
Fig. 15) after Block creation has occurred. This Song data is preferably fed out to the DSP 
device shown in Fig. 32. SMC buffers are preferably used for write operations to the SMC. 

SMC is a Flash memory technology that doesn't allow the modification of a single bit. 
To perform a write to the SMC, one must read the entire SMC Block, update the desired portion 
of the SMC Block, and then write the entire SMC Block back to the SMC. In the interests of 
efficiency, the currently used SMC Block is preferably maintained in the SMC buffers. 

As one can appreciate, the system configuration described above cannot simultaneously 
playback large WMA/MP3 streams while also writing to the SMC. This is because the two 
functions preferably alternatively use the same memory region. This is a creative use of limited 
resources, because it is preferably a relatively unusual condition to be reading WMA/MP3 while 
writing SMC at the same time. So the code is preferably arranged to swap in and out of the same 
location. Such an arrangement allows maximized use of the limited resources in a portable 
environment such as Fig. 32. 
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However, in a more powerful environment (with additional resources, and/or faster clock 
speed), this 'multi-use' of a shared region of memory could preferably be eliminated, and 
simultaneous use of WMA/MP3 and the Record function could easily be implemented. 
Obviously, these additional enhancements for use in a portable environment do not limit the 
other aspects of the present invention. 

The system discussed above is portable, but preferably has extremely high-quality sound. 
On a very basic level, this is partly due to the use of a sound chip that typically would be found in 
a high-end sound card in a PC system. The SAM9707 chip is preferable because of its excellent 
sound capabilities, but this has required it be adapted somewhat to work in the portable example 
discussed herein. 

One characteristic of the SAM9707 is that it is typically configured to work with SDRAM 
in a sound card. This SDRAM would typically hold the MIDI sound banks during normal 
operation. Such sound banks are preferably a critical part of the final sound quality of music that 
is output from a DSP-enabled system. In fact, another reason why this particular chip is 
preferable is to allow custom sounds to preferably be designed. 

In the example above of a portable system, SDRAM adds significantly to the power 
requirements, as well as the address logic. Accordingly, it is desirable to use a variation of the 
configuration, preferably using Flash as local DSP sound bank storage (see Fig. 32). The use of 
Flash memory as local DSP storage is a bit problematic because, in order to allow a user to 
upgrade the sound banks of their portable Player system, the local DSP Flash memory preferably 
needs to be accessible from the MP side of the architecture. Such access could be gained through 
the use of a dual-port Flash memory, with memory access from both the DSP busses and the 
ARM MP busses, but such a dual port architecture would add expenses and complexity to the 
system. 

The problem of reaching a proper balance between maintaining the low power/simple 
architecture on one hand, and providing high quality, upgradeable, music sound banks on the 
other hand, is preferably solved by adapting a mode of the DSP chip, and preferably customizing 
the address logic in such a way that the DSP can be "tricked" into providing the access from the 
MP side to the local DSP Flash memory. 
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Fig. 36 describes an example of an addressing space for the DSP local RAM and Flash 
storage. Starting from the bottom of the map, the first section is preferably for Firmware, and 
this is typically addressed to a Flash memory region. The next section is preferably the sound 
banks, and this is also typically addressed to a Flash region. The third section is preferably 
addressed to Flash when signal A24 is active (in this case, A24 is active low, or = 0). Signal A24 
is discussed more below. The fourth section, with starting address 0x1000000, is preferably a 
32Kb block that is not addressed to any memory locations. The fifth section is preferably also 
32Kb and is preferably addressed to the local DSP RAM (labeled RAM a ). Note that when 
addressing this area, signal A24 is preferably high. The seventh section, with starting address 
0x2000000, is preferably a 32Kb section that preferably resolves to RAM (labeled RAM b ). The 
two 32Kb RAM regions are preferably combined into the 64Kb local RAM. 

So the first variation of the present invention, to the general use of the DSP chip, 
especially in its intended context of a sound card for a PC, is the address location of the RAM a . 
This region is selected to allow a very simple address decode logic arrangement (preferably 
external to the DSP) so that the assertion of A24 will preferably toggle the destination of RAM a 
addresses, between DSP-local RAM and DSP-local Flash memories. This variation preferably 
involves a firmware modification that will allow the specific location of RAM a to be configured 
properly preferably by default at startup time. There are other ways to modify this location after 
initialization, but they are more complicated, and therefore are not as desirable as the present 
method. 

Another variation to the intended context of the DSP chip address map preferably 
involves a creative implementation of the DSPs BOOT mode to allow the sound banks to be 
upgraded, even though the sound banks are preferably located in the local Flash memory of the 
DSP chip; a location not typically accessible for sound bank upgrades. 

In this example, the BOOT mode of the DSP causes an internal bootstrap program to 
execute from internal ROM. This bootstrap program might typically be used while upgrading the 
DSP firmware. As such, the internal bootstrap expects to receive 256 words from the 16bit burst 
transfer port, which it expects to store at address range 0100H-01FFH in the local memory, after 
which the bootstrap program resumes control at address 0100H. This relatively small burst is 
fixed, and is not large enough to contain sound banks. Furthermore, it does not allow the 



67 



complex Flash memory write activities, as discussed above in connection with the SMC. Since 
our design preferably uses Flash instead of SDRAM, we have found it highly desirable to use this 
bootstrap burst to load code that preferably 'tricks' the ROM bootstrap to effectuate the transfer 
of special code from the ARM MP bus to the RAM. This special code is then used to preferably 
effectuate the transfer of sound bank upgrade data from the ARM MP bus to the Flash memory. 

Fig. 37 is a simple truth table that provides additional information on this unusual use of 
the DSP bootstrap mode addressing scheme. Fig. 38 is a more detailed truth table that highlights 
the usefulness of our unusual DSP address logic, including the preferable use of the A24 signal 
controllable by the ARM MP, preferably by use of the BOOT signal. 

hi the present example, the A24 address line generated by the DSP is preferably altered by 
the BOOT signal controlled by the MP before being presented to the address decoding logic of 
the DSP local memory. This arrangement permits the MP to preferably invert the DSP's 
selection of RAM and Flash in BOOT mode, and thus allows the RAM to preferably be available 
at address 0x100 to receive the upgrade code. 

Additional variations to the hardware arrangement discussed above can be considered. 
For example, if the power level is increased, and the MP performance increased, the DSP could 
be substituted with a software DSP. This may result in lower quality sounds, but it could have 
other benefits that outweigh that, such as lower cost, additional flexibility, etc. The DSP could 
similarly be replaced with a general-purpose hardware DSP, with the result of lower quality 
sounds, possibly outweighed by the benefits of increased portability, etc. The MP could be 
replaced with one having a greater number of integrated interfaces (e.g., USB, SMC, LCD, etc.), 
and/or more RAM, faster clock speed, etc. With a few changes to some of the disclosed 
embodiments, one could practice the present invention with only a DSP (no separate MP), or a 
dual die DSP/MP, or with only an MP and software. Additionally, the SMC memory storage 
could be substituted with a Secure Digital (SD) memory card with embedded encryption, and/or 
a hard disk drive, compact flash, writeable CDROM, etc., to store sound output. Also, the LCD 
could be upgraded to a color, or multi-level gray LCD, and/or a touch-sensitive display that 
would preferably allow another level of user interface features. 

Yet a further variation of the present discussion preferably can be the incorporation of a 
electromagnetic or capacitive touch pad pointing device, such as a TouchPad available from 
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Synaptics, to provide additional desirable characteristics to the user interface. Both the touch pad 
and the touch sensitive display mentioned above can be used to provide the user with a way to 
tap in a rhythm, and/or strum a note/chord. Such a device preferably can be used to enable a 
closer approximation to the operation of a particular instrument group. For example, the touch 

5 pad can be used to detect the speed and rhythm of a user's desired guitar part from the way the 
user moves a finger or hand across the surface of the touch pad. Similarly, the movement of the 
users hand through the x and y coordinates of such a pointing device can be detected in 
connection with the pitch and/or frequency of an instrument, or the characteristics of an effect or 
sample. In another example, a touch pad pointing device can also be used to trigger and/or 

10 control turntable scratching sounds approximating the scratching sounds a conventional DJ can 

H generate with a turntable. 

O As can be seen in Fig. 32, one example of a DSP that can be used in the context of the 

*jj present invention is the SAM9707 chip available from the Dream S. A. subsidiary of Atmel 
4= Corporation. This particular chip is able to handle incoming MIDI and audio stream information. 
Q When incorporating the DSP into a generative/interactive music system, it is highly 

q desirable to synchronize the MIDI and audio streams. A sample preferably has to play at exactly 
% the right time, every time; when the audio stream components get even slightly out of sync with 
=j: the MIDI events, the resulting musical output generally is unacceptable. This delicate nature of 
HI mixing audio streams and MIDI together in a generative/interactive context is worsened by the 
20 nature of the Flash read process, in that SMC technology is slow to respond, and requires 

complex read machinations. It is difficult to accurately sync MIDI events with playback of audio 
from a Flash memory location. Because of the delay in decoding and playing a sample 
(compared to a MIDI event), there is a tradeoff in either performing timing compensation, or 
preloading relatively large data chunks. Because of these issues, it is preferable to configure a 
25 new way to use MIDI and audio streams with the DSP chip . While this aspect of the present 

invention is discussed in terms of the DSP architecture, it will be obvious to one of ordinary skill 
in the art of MIDI/audio stream synchronization that the following examples apply to other 
similar architectures. 

Fig. 39 shows a simplified logical arrangement of the MIDI and Audio Streams in the 
30 music generation process. The two inputs going to the Synth are preferably merged and turned 
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into a digital audio output signal. This output signal is then preferably fed to a digital to analog 
converter (DAC), from which is preferably output an analog audio signal suitable for use with 
headphones, etc. Note that in our example, the Audio stream input to the Synth might typically 
come from a relatively slow memory means (e.g.; Flash memory), while the MIDI input to the 
Synth might come from a relatively fast memory means (e.g.; SRAM buffer). 

The two inputs to the Synth device preferably may actually share a multiplexed bus; but 
logically they can be considered as separately distinguishable inputs. In one example, the two 
inputs share a 16bit wide bus. In this case, the MIDI input preferably may occupy 8bits at one 
time, and the audio stream input preferably may occupy 16bits at another time. Following this 
example, one stream preferably may pause while the other takes the bus. Such alternating use of 
the same bus can mean that relatively small pauses in each stream are constantly occurring. Such 
pauses are intended to be imperceptible, and so, for our purposes here, the two streams can be 
thought of as separate. 

Fig. 40 shows a simplified MIDI/Audio Stream timeline. Assume that Fig. 40 is the 
timing for the very beginning of a Block. It follows then, that in this case, the designer wants to 
play a MIDI note, starting 250ms after the beginning of the Block, that will last 500ms. The 
duration of the note relates to the type of note being played, for example, if it is a quarter note in 
a 4/4 time, and with a measure duration of 2 seconds, a 500ms would correspond to a quarter 
note duration. Also indicated in Fig. 40, that an Audio stream event such as a short voice sample 
"yo" will preferably be synchronized to occur in the middle of the MIDI event. Bear in mind that 
this method allows the sample to preferably be quantized to the music, in the sense that it can 
involve the subtle correction of minor timing errors on the part of the user by synchronizing the 
sample to the musical context. 

In this example, largely because of the constraints of the system architecture example 
discussed above, this is not a trivial thing to accomplish consistently and accurately using 
conventional techniques. Keeping in mind that the MIDI event is preferably generated almost 
instantly by the Synth chip, whereas the Audio Stream event could require one or more of the 
following assistance from the ARM MP: fetching a sound from SMC, decompressing (PCM, 
etc.), adding sound effects (reverb, filters, etc.). 
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In this example, it is highly desirable to create a special MIDI file preferably containing 
delta time information for each event, and specialized non-registered parameter numbers 
(NRPNs). This feature is especially advantageous when used with a Sample List (as mentioned 
above) because the name of a particular sample in a list is preferably implicit, and the NRPNs 
5 can preferably be used to trigger different samples in the particular sample list without explicitly 
calling for a particular sample name or type. This type of optimization reduces the burden of 
fetching a particular sample by name or type, and can preferably allow the samples used to be 
preloaded. 

Fig. 41 depicts an example of a MIDI NRPN that can be advantageously incorporated into 
10 the present invention to allow efficient synchronization of MIDI events with audio samples and 
t ^ effects. The left column depicts the hexadecimal values making up the MIDI NRPN stream. As 

anyone who works with the MIDI Specification (previously incorporated by reference) will 
hi appreciate, the MIDI NRPN is a data structure that enables custom use of portions of a MIDI 
jj stream. Accordingly, it can preferably be used to trigger specific custom events for a given 
=f§ architecture. 

In Fig. 41, the first hexadecimal value 'BO' preferably indicates a channel number, as well 
?[[ as that it is a MIDI controller command. This can be used to assist with routing in a multi- 
w channel arrangement. In our example, for purposes of simplicity this is set channel 0. The 
0 second value '63 ' preferably indicates that this particular stream contains NRPN information for 
% a particular controller (e.g., 'A'). In this example, NRPN Controller A can be understood by the 
firmware/software to indicate an audio sample type. The third row value of '40' preferably is 
data that corresponds to the controller, and in our example this data can be understood to 
describe the type of sample. As an example of the usefulness of this arrangement, if the type is 
set to 'long', then the firmware/software preferably can arrange to load the sample in chunks. 
25 The fourth row preferably indicates a delta time, in MIDI clicks, that can preferably be used to 
precisely time the next event. In our example, this delta time is set to '00' for simplicity. The 
fifth row preferably indicates that this particular stream contains NRPN information for a 'B' 
controller. In this example, NRPN Controller B can be understood by firmware/software to 
indicate an audio effects type. This is because we have found it advantageous to use a MIDI DSP 
30 component that includes certain audio effects that can be controlled effectively in a timely 
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manner via MIDI NRPNs. The sixth row preferably indicates the identification of the particular 
audio effects type called for in this NRPN example. While '00' is shown for simplicity, it should 
be understood that the value in this part of the MIDI stream can be interpreted by the 
firmware/software to select a particular effect from the available audio effects for a particular 
architecture. The seventh row preferably indicates another delta time that can be interpreted as a 
delay. The eighth row preferably can be used to indicate to the firmware/software the 
identification of a register to store the NRPN Controller A value shown in row nine. The ninth 
row uses '03' as an example; this preferably can be interpreted to mean the third audio sample in 
a list corresponding to a song (see 'Sample List' in Figs. 29 and 30). Value '00' can be used 
effectively to instruct the firmware/software to select a sample from the sample list randomly. 
The tenth row of Fig. 41 is preferably another delta time value (e.g., '00' is zero MIDI clicks). 
The eleventh row preferably can be used to indicate to the firmware/software the identification of 
a register to store the NRPN Controller B value shown in row 12. The twelfth row uses '07' as 
an example; in the present discussion this preferably can be interpreted by the firmware/software 
to instruct the MIDI DSP to apply a particular audio effect among those available. 

Fig. 42 is a simplified depiction of a special MIDI type file that is an example of the 
arrangement of the data being sent from the ARM MP to the DSP preferably via the MIDI input 
stream, along the lines of the example above. 

The top of the figure indicates that the first information in this file is a delta time of 
250ms. This corresponds to the 250ms delay at the beginning of Fig. 40. Next in the file 
depicted in Fig. 42 is general MIDI information preferably indicating a note on event for channel 
1, pitch C. This corresponds to the time in Fig. 40 when 250ms has passed. Next in Fig. 42, we 
have another 250ms delta time. This represents the time between the previous MIDI event, and 
the next Audio Stream event at time 500ms in Fig. 40. Next, in Fig. 42 we have an NRPN 
message that preferably indicates to the Synth chip that it needs to play the audio stream event X, 
with various parameters P, and various effects E. This corresponds to the audio stream event 
('yo') depicted in Fig. 40. Then, in Fig. 42 we have another delta time event of 250ms, followed 
by the general MIDI information preferably indicating a note off event for channel 1, pitch C. 
This final step corresponds to the end of the MIDI event in Fig. 40 (e.g., 'C quarter note). 
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In the previous example, the delta time preferably can be different (and often is) each time 
in the special MIDI type file. In our simplified example, and because we want to make the 
timing relationship with a quarter note, etc., more clear, we have used the same 250ms value 
each time. Obviously, in a more complex file, the delta time will vary. 

As previously described, voice and other audio samples may be encoded, stored and 
processed for playback in accordance with the present invention. In certain preferred 
embodiments, voice samples are coded in a PCM format, and preferably in the form of an 
adaptive (predictive), differential PCM (ADPCM) format. While other PCM formats or other 
sample coding formats may be used in accordance with the present invention, and particular 
PCM coding formats (and ways of providing effects as will be hereinafter described) are not 
essential to practice various aspects of the present invention, a description of exemplary ADPCM 
as well as certain effects functions will be provided for a fuller understanding of certain preferred 
embodiments of the present invention. In accordance with such embodiments, a type of ADPCM 
may provide certain advantages in accordance with the present invention. 

As will be appreciated by those of skill in the art based on the disclosure herein, the use 
of ADPCM can enable advantages such as reduced size of the data files to store samples, which 
are preferably stored in the non-volatile storage (e.g., SMC), thus enabling more samples, song 
lists and songs to be stored in a given amount of non- volatile storage. Preferably, the coding is 
done by a packet of the size of the ADPCM frame (e.g., 8 samples). For each packet, preferably a 
code provides the maximum value; the maximum difference between two samples is coded and 
integrated in the file. Each code (difference between samples (delta_max) and code of the packet 
(diffmax)) uses 4 bits. In accordance with this example, the data/sample is therefore (8*4+4)/8 
= 4.5 bits/sample. 

As will be appreciated, this type of coding attempts to code only what is really necessary. 
Over 8 samples, the maximum difference between two samples is in general much less than the 
possible dynamic range of the signal (+32767A32768), and it is therefore possible to allow 
oneself to code only the difference between samples. Preferably, the ADPCM is chosen to be 
suitable for the voice that is relatively stationary. By predictive filtering, it is possible to reduce 
the difference between a new sample and its prediction. The better the prediction, the smaller the 
difference, and the smaller the coding (the quantization) that is chosen, taking into account the 
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average differences encountered. While it will be appreciated that this approach requires 
additional computation ability for the prediction computation, it is believed that this approach 
provides significant advantages in reduced storage for samples with acceptable sample coding 
quality in accordance with the present invention. While more conventional or standardized 
ADPCM desires to offer a coding time without introducing delays, with the present invention it 
has been determined that such attributes are not essential. 

A simple coding without prediction and taking into account only average values of 
differences encountered reacts very poorly to a non-stationary state (e.g., each beginning of a 
word or syllable). For each new word or syllable, a new difference much greater than the average 
differences previously encountered typically cannot be suitably coded. One therefore tends to 
hear an impulse noise depending on the level of the signal. Preferably, the solution is therefore to 
give the maximum value of the difference encountered (one therefore has a delay of 8 samples, a 
prediction is thus made for the quantizer only) for a fixed number of samples and to code the 
samples as a function of this maximum difference (in percentage). The coding tends to be more 
optimal at each instant, and reacts very well to a non-stationary state (each beginning of a word 
or syllable). Preferably, the coding is logarithmic (the ear is sensitive to the logarithm and not to 
the linear), and the Signal/Noise ratio is 24 db. In preferred embodiments, this function is put in 
internal RAM in order to be executed, for example, 3 times more rapidly (one clock cycle for 
each instruction instead of three in external flash memory). 

Preferably certain effects may be included in the ADPCM coding used in certain 
embodiments of the present invention. For example, a doppler effect may be included in the 
ADPCM decoding since it requires a variable number of ADPCM samples for a final fixed 
number of 256 samples. As is known, such a doppler effect typically consists of playing the 
samples more or less rapidly, which corresponds to a variation of the pitch of the decoded voice 
accompanied by a variation of the speed together with the variation of pitch. In order to give a 
natural and linear variation, it is desirable to be able to interpolate new samples between two 
other samples. The linear interpolation method has been determined to have certain 
disadvantages in that it tends to add unpleasant high frequency harmonics to the ear. 

The method traditionally used consists of over-sampling the signal (for example, in a 
ratio [of] 3 or 4) the signal and then filtering the aliasing frequencies. The filtered signal is then 
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interpolated linearly. The disadvantage of this method is that it requires additional computational 
ability. Preferably, in accordance with certain embodiments, a technique is utilized that consists 
of interpolating the signal with the four adjacent samples. It preferably corresponds to a second 
order interpolation that allows a 4.5 dB gain for the harmonics created by a linear interpolation. 
While 4.5 db seems low, it is important to consider it in high frequencies where the voice signal 
is weak. The original high frequencies of the voice are masked by the upper harmonics of the low 
frequencies in the case of the linear method, and this effect disappears with second order 
interpolation. Moreover, it tends to be three times faster than the over-sampling method. 
Preferably, this function is put in internal RAM in order to be executed, for example, 3 times 
more rapidly (one clock cycle for each instruction instead of three in external flash memory). 

Also in accordance with preferred embodiments, an electronic metronome function is 
included, which consists of counting the period number (the pitch) in an analysis window in 
order to deduce from this the fundamental frequency. Preferably, this function may be utilized to 
process samples in order to reveal the periods. In general, it is not feasible to count the peaks in 
the window because the signal tends to vary with time (for example, the beating of 1 to 3 piano 
strings that are not necessarily perfectly in tune); moreover, in the same period, there can be more 
than one peak. In accordance with such embodiments, the distance between a reference 
considered at the beginning of the analysis window and each of the panes shifted by one sample. 
For a window of 2*WINDOW_SIZE samples and a reference window of WINDOW_SIZE 
samples, one therefore may therefore carry out WINDOW_SIZE computations of distance on 
WINDOW_SIZE samples. Preferably, the computation of distance is done by a sum of the 
absolute value of the differences between reference samples and analysis samples. This function 
preferably is put in internal RAM in order to be executed, for example, 3 times more rapidly (one 
clock cycle for each instruction instead of three in external flash memory). 

Also in accordance with such embodiments, special effects such as wobbler, flange, echo 
and reverb may be provided with the ADPCM encoding. Such special effects preferably are 
produced over 256 samples coming from the ADPCM decoder and from the doppler effect. 
Preferably, this function is put in internal RAM in order to be executed, for example, 3 times 
more rapidly (one clock cycle for each instruction instead of three in external flash memory). 
Preferably, the average value of the sample is computed, and it is subtracted from the sample 



75 



(which can be present over the samples) in order to avoid executing the wobbler function on it, 
which would add the modulation frequency in the signal (and tend to produce an unpleasant 
hiss). Preferably, the method for the wobbler effect is a frequency modulation based on sample = 
sample multiplied by a sine function (based on suitable wobbler frequencies, as will be 
understood by those of skill in the art). 

Also in accordance with the preferred embodiments, the purpose of the flange effect is to 
simulate the impression that more than one person is speaking or singing with a single source 
voice. In order to limit the computation power, two voices preferably are simulated. In order to 
provide this impression, preferably the pitch of the source voice is changed and added to the 
original source voice. The most accurate method would be to analyze the voice using a vocoder 
and then to change the pitch without changing the speed. In each case, one could have the 
impression that a man and a woman are singing together, although such a method typically would 
require DSP resources. A method that changes the pitch without changing the speed (important 
if one wants the voices to remain synchronous) consists of simulating the second voice by 
alternately accelerating and decelerating the samples. One then produces the doppler effect 
explained in the preceding, but with a doppler that varies alternately around zero in such a way as 
to have a slightly different pitch and the voices synchronous. With such embodiments, one may 
simulate, for example, a person placed on a circle approximately 4 meters in diameter regularly 
turning around its axis and placed beside another stationary person. 

Also in accordance with such embodiments, the echo effect is the sum of a source sample 
and of a delayed sample, and the reverb effect is the sum of a source sample and a delayed 
sample affected by a gain factor. The delayed samples preferably may be put in a circular buffer 
and are those resulting from the sum. The formula of the reverb effect may therefore be: 

Sample(O) = sample(O) + sample(-n)*gain + sample(-2*n)*gain A 2 + sample (-3*n)*gain A 
+ . . . + sample(-i*n)*gain A i. Preferably, the gain is chosen to be less than 1 in order to avoid a 
divergence. In accordance with preferred embodiments, for reasons of size of the buffer, which 
can be considerable, the echo effect preferably uses the same buffer as that of the reverb effect. In 
order to have a true echo, it is necessary to give reverb a gain effect that is zero or low. The two 
effects can function at the same time. The delay between a new sample and an old one is 
produced by reading the oldest sample put in the memory buffer. In order to avoid shifting the 
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buffer for each new sample, the reading pointer of the buffer is incremented by limiting this 
pointer between the boundaries of the buffer. The size of the memory buffer therefore depends 
on the time between samples. 

Also in accordance with such embodiments, an electronic tuner function may be 
provided, the aim of which is to find the fundamental of the sample signal coming from the 
microphone in order to give the note played by a musical instrument. Similar to what has been 
described previously, a preferred method will consist of computing the number of periods for a 
given time that is a multiple of the period in order to increase the accuracy of computation of the 
period. In effect, a single period will give little accuracy if the value of this period is poor 
because of the sampling. In order to detect the periods, preferably one uses a routine which 
computes the distance between a reference taken at the beginning of the signal and the signal As 
will be understood, the period will be the position of the last period divided by the total number 
of periods between the first and the last period. The effective position of the last period is 
computed by an interpolation of the true maximum between two distance samples. The period 
thus computed will give by inversion (using a division of 64 bits/3 2bits) the fundamental 
frequency with great precision (better than 1/4000 for a signal without noise, which is often the 
case). 

Also in accordance with such embodiments, a low pass filter (or other filter) function may 
be provided as part of the effects provided with the ADPCM sample coding. Such a function 
may eliminate with a low-pass filter the high frequencies of the samples used for computation of 
the distance such for the routines previously described. These high frequencies tend to disturb 
the computations if they are too elevated. Filtering is done by looking for the highest value in 
order to normalize the buffer used for computation of the distance. 

Also in accordance with the present invention, there are numerous additional 
implementations and variations that preferably can be used with many desirable aspects of the 
present invention. Exemplary ways to use the present invention to great effect include a 
software-based approach, as well as general integration with other products. Additionally, 
several valuable variations to the present invention can be used with great success, especially 
with regard to media content management, integration with video, and other miscellaneous 
variations. 
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Many aspects of the present invention can be incorporated with success into a software- 
based approach. For example, the hardware DSP of the above discussion can be substituted with 
a software synthesizer to perform signal processing functions (the use of a hardware-based 
synthesizer is not a requirement of the present invention). Such an approach preferably will take 
advantage of the excess processing power of, for example, a contemporary personal computer, 
and preferably will provide the quality of the music produced in a hardware-based device, while 
also providing greater compatibility across multiple platforms (e.g., it is easier to share a song 
that can be played on any PC). Configuring certain embodiments of the present invention into a 
software-based approach enables additional variations, such as a self-contained application 
geared toward a professional music creator, or alternatively geared towards an armchair music 
enthusiast. Additionally, it is preferable to configure a software-based embodiment of the 
present invention for use in a website (e.g., a Java language applet), with user preferences and/or 
customizations to be stored in local files on the user's computer (e.g., cookies). Such an 
approach preferably enables a user to indicate a music accompaniment style preference that will 
' stick 7 and remain on subsequent visits to the site. Variations of a software-based approach 
preferably involve a 'software plug-in' approach to an existing content generation software 
application (such as Macromedia Flash, Adobe Acrobat, Macromedia Authorware, Microsoft 
PowerPoint, and/or Adobe AfterEffects). It is useful to note that such a plug-in can benefit from 
the potentially royalty free music, and that in certain embodiments, it may be preferable to export 
an interactively generated musical piece into a streaming media format (e.g., ASF) for inclusion 
in a Flash presentation, a PDF file, an Authorware presentation, an AfterEffects movie, etc. 
Certain embodiments of the present invention can be involved in a Internet-based arrangement 
that enables a plurality of users to interactively generate music together in a cooperative sense, 
preferably in real time. Aspects of the present invention involving customized music can be 
incorporated as part of music games (and/or music learning aids), news sources (e.g., internet 
news sites), language games (and/or language learning aids), etc. Additionally, a 
software/hardware hybrid approach incorporating many features and benefits of the present 
invention can involve a hybrid "DSP" module that plugs into a high speed bus (e.g., IEEE 1394, 
or USB, etc.) of a personal computing system. In such an approach, the functionality of MP 36 
can be performed by a personal computing system, while the functionality of DSP 42 can be 
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performed by a DSP located on a hardware module attached to a peripheral bus such as USB. 
Following this example, a small USB module about the size of a automobile key can be plugged 
into the USB port of a PC system, and can be used to perform the hardware DSP functions 
associated with the interactive auto-generation of algorithmic music. 

As will be appreciated, aspects of the present invention may be incorporated into a variety 
of systems and applications, an example of which may be a PBX or other telephone type system. 
An exemplary system is disclosed in, for example, USP 6,289,025 to Pang et al., which is hereby 
incorporated by reference (other exemplary systems include PBX systems from companies such 
as Alcatel, Ericsson, Nortel, Avaya and the like). As will be appreciated from such an exemplary 
system, a plurality of telephones and telephony interfaces may be provided with the system, and 
users at the facility in which the system is located, or users who access the system externally 
(such as via a POTS telephone line or other telephone line), may have calls that are received by 
the system. Such calls may be directed by the system to particular users, or alternatively the calls 
may be placed on hold (such aspects of such an exemplary system are conventional and will not 
be described in greater detail herein). Typically, on-hold music is provided to callers placed on 
hold, with the on-hold music consisting of a radio station or taped or other recorded music 
coupled through an audio input, typically processed with a coder and provided as an audio stream 
(such as PCM) and coupled to the telephone of the caller on hold. 

In accordance with embodiments of the present invention, however, one or more modules 
are provided in the exemplary system to provide on-hold music to the caller on hold. Such a 
module, for example, could include the required constituent hardware/software components of a 
Player as described elsewhere herein (see, e.g., Fig. 32 and related description) (for purposes of 
this discussion such constituent hardware/software components are referred to as an "auto- 
composition engine"), but with the user interface adapted for the PBX-type of environment. In 
one such exemplary embodiment, one or more auto-composition engines are provided, which 
serve to provide the on-hold music to one or more callers on hold. In one example, a single auto- 
composition engine is provided, and the first caller on hold may initially be presented with auto- 
composed music of a particular style as determined by the auto-composition engine (or processor 
controlling the exemplary system) (this may also be a default on hold music style selected by a 
configuration parameter of the exemplary system). Preferably, via an audio prompt provided by 
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the resources of the exemplary system, the caller on hold is provided with audio information 
indicating that the caller on hold may change the style of on-hold music being provided (such 
audio prompt generation is considered conventional in the context of such exemplary systems 
and will not be described in greater detail herein). Preferably, the user may indicate such desire 
5 by pressing a predetermined digit (which preferably is identified in the audio prompt) on the 
telephone key pad, which may be detected by the resources of the exemplary system (such digit 
detection capability is considered conventional in the context of such exemplary systems and will 
not be described in greater detail herein), and thereafter may be provided with preferably a 
plurality of music styles from which to select the style of on-hold music (such as with audio 
10 prompts providing available styles of music followed by one or more digits to be entered to select 

the desired style of music). Thereafter, the user may depress the appropriate digit(s) on the 
O telephone keypad, which are detected by the resources of the exemplary system, which preferably 
lij decodes the digits and sends control information to one of the auto-composition engines, in 
r % response to which the auto-composition engine thereafter begins to auto-compose music of the 
I! selected style, which is directed to the caller on hold as on hold music. 
g What is important is that, in accordance with such embodiments, one or more auto- 

2 composition engines are adapted for the exemplary system, with the command/control interface 
y of the auto-composition engine being changes from buttons and the like to commands from the 
p resources of the exemplary system (which are generated in response to calls being placed on 
2(j hold, digit detection and the like). In accordance with variations of such embodiments, a 

plurality of auto-composition engines are provided, and the resources of the system selectively 
provide on-hold music to on hold callers of a style selected by the caller on hold (such as 
described above). In one variation, there may potentially be more callers on hold than there are 
auto-composition engines; in such embodiments, the callers on hold are selectively coupled to 
25 one of the output audio streams of the auto-composition engines provided that there is at least 
one auto-composition engine that is not being utilized. If a caller is place on hold at a time when 
all of the auto-composition engines are being utilized, the caller placed on hold is either coupled 
to one of the audio streams being output by one of the auto-composition engines (without being 
given a choice), or alternatively is provided with an audio prompt informing the user of the styles 
30 of on-hold music that are currently being offered by the auto-composition engines (in response 
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thereto, this caller on hold may select one of the styles being offered by depressed one or more 
digits on the telephone keypad and be coupled to an audio stream that is providing auto- 
composed music of the selected style). 

Other variations of such embodiments include: (1) the resources of the exemplary system 
5 detect, such as via caller ED information or incoming trunk group of the incoming call, 

information regarding the calling party (such as geographic location), and thereafter directs that 
the on hold music for the particular on hold be a predetermined style corresponding to the caller 
ED information or trunk group information, etc.; (2) the resources of the exemplary system 
selectively determines the style of the on-hold music based on the identity of the called party 
10 (particular called parties may, for example, set a configuration parameter that directs that their on 
jM hold music be of a particular style); (3) the resources of the exemplary system may selectively 
f=j determine the style of on-hold music by season of the year, time of day or week, etc.; (4) the 
' exemplary system includes an auto-composition engine for each of the styles being offered, 
=p thereby ensuring that all callers on-hold can select one of the styles that are offered; (5) default or 
*f initial music styles (such as determined by the resources of the exemplary system or called party, 
p n etc., as described above) are followed by audio prompts that enable the caller on hold to change 

the music style; and (6) the resources of the exemplary system further provide audio prompts that 
5 p enable a user to select particular music styles and also parameters that may be changed for the 
music being auto-composed in the particular music style (in essence, audio prompt generation 
20 and digit detection is provided by the resources of the exemplary system to enable the caller on 
hold to alter parameters of the music being auto-composed, such as described elsewhere herein. 

Other examples of novel ways to generally integrate aspects of the present invention with 
other products include: video camera (e.g., preferably to enable a user to easily create home 
movies with a royalty free, configurable soundtrack), conventional stereo equipment, exercise 
25 equipment (speed/intensity/style programmable, preferably similar to workout-intensity- 
programmable capabilities of the workout device, such as a StairMaster series of hills), 
configurable audio accompaniment to a computer Screensaver program, and configurable audio 
accompaniment to an information kiosk system. 

Aspects of the present invention can advantageously be employed in combination with 
30 audio watermarking techniques that can embed (and/or detect) an audio * fingerprint' on the 
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musical output to facilitate media content rights management, etc. The preferable incorporation 
of audio watermarking techniques, such as those described by Verance or Digimarc (e.g., the 
audio watermarking concepts described by Digimarc in US patents 6,289,108 and 6,122,392, 
incorporated herein by reference), can enable a user with the ability to monitor the subsequent 
usage of their generated music. 

In another example, certain embodiments of the present invention can be incorporated as 
part of the software of video game (such as a PlayStation 2 video game) to provide music that 
preferably virtually never repeats, as well as different styles preferably selectable by the user 
and/or selectable by the video game software depending on action and/or plot development of the 
game itself. 

Additionally, there are certain novel variations to the present invention that incorporate 
many advantages of the present invention to great effect. For example, in the portable hardware 
device 35 in Fig. 32, the incoming data on MIC input 51 (e.g., a vocal melody of the user) can 
pass through hardware codec 52 to MP 36, where it can be analyzed by the MP 36 and 
processed/adjusted by DSP 42 (under control of MP 36) to subtly 'improve' pitch and/or rhythm 
characteristics. This example illustrates a preferable arrangement that allows a user's vocal input 
to be adjusted to conform to the key and/or rhythmic characteristics of the accompanying music. 
Continuing this example, the pitch of a user's input to MIC input 51 preferably can be analyzed 
by the portable hardware device 35 and bumped up or down in pitch to more closely match a 
pitch that fits the current key and/or mode of the music. Such a variation provides a novice user 
with a easy way to generate songs that are musically compelling, yet preferably are also 
noticeably derivative of the user's input (e.g., vocal). In another example variation, the circuitry 
mentioned here preferably can be available to analyze the user's input (e.g., vocal) and infer 
some type of timing and/or melody information, which information preferably can then be used 
in the interactive music autogeneration to help define the pitch values and/or the rhythmic data 
comprised in the RP. This example presents a way for a user to demonstrably interact with, and 
influence, the musical output, all the while without needing to fully understand the complexities 
of musical composition. 

Additionally, many aspects of the present invention are useful to enable a new concept in 
Firmware upgrades. Using aspects of the present invention, firmware updates can be made 



82 



available to users, complete with embedded advertising, which provides the Firmware 
manufactures/distributors with a revenue source other than the user. This concept preferably 
involves the distribution of firmware (or other software-based programs such as sound bank data) 
upgrades that contain embedded advertising images (and/or sounds). Such images/sounds 
5 preferably can temporarily appear during the operation of the music product, and can fund the 
development of customized firmware for users to preferably freely download. 

As will be understood by a person of ordinary skill in the art of portable electronic music 
design, the examples discussed here are representative of the full spirit and scope of the present 
invention. Additional variations, some of which are described here, incorporate many aspects of 
10 the present invention. 

|=i Although the invention has been described in conjunction with specific preferred and 

f=% other embodiments, it is evident that many substitutions, alternatives and variations will be 
41 apparent to those skilled in the art in light of the foregoing description. Accordingly, the 
*p invention is intended to embrace all of the alternatives and variations that fall within the spirit 
|| and scope of the appended claims. For example, it should be understood that, in accordance with 
JL the various alternative embodiments described herein, various systems, and uses and methods 
H based on such systems, may be obtained. The various refinements and alternative and additional 
1; features also described may be combined to provide additional advantageous combinations and 
lf t the like in accordance with the present invention. Also as will be understood by those skilled in 
20 the art based on the foregoing description, various aspects of the preferred embodiments may be 
used in various subcombinations to achieve at least certain of the benefits and attributes 
described herein, and such subcombinations also are within the scope of the present invention. 
All such refinements, enhancements and further uses of the present invention are within the 
scope of the present invention. 
25 To further facilitate the various embodiments of the present invention, and the novel uses 

and applications thereof (including starting and using a particular implementation of a Player 
known as the "MadPlayer"), the following documents are attached hereto and incorporated herein 
by reference as part of this application and invention: MadPlayer MadManual (76 pages); 
MadPlayer Quickstart and Setup Guide (3 pages); MadWare Reference (26 pages); and Custom 
30 Graphics Specification (13 pages). 
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